Maxim Moinat is a data engineer with a demonstrated history of working in bioinformatics and medical informatics. He has worked as a data engineer/software developer for The Hyve since 2016, and recently became a scientific researcher for the Erasmus MC medical informatics department. He has been a long-time collaborator with both the OHDSI and EHDEN communities.
Maxim was a 2021 Titan Award honoree for his invaluable contributions to data standards. He leads the Registry workgroup, contributes to open-source development and has provided tutorials and other presentations during global community calls. He said his passion is to apply his skills to advance biomedical science, ultimately improving healthcare for many patients.
Maxim recently shared several insights in our latest Collaborator Spotlight.
Can you discuss your background, and how you got into your career as a data engineer?
In 2015, I finished my Master's in Bioinformatics and Medical Science, where I learned to program and began to apply what I had learned to clean up messy data. The first datasets I worked with were actually in another domain, namely sports data. Together with a friend, I harmonised and analysed large amounts of track and field results. This led to my first and (so far) only scientific publication. Many of the skills I learned there, and the challenges I faced, I now apply in harmonising health data to the OMOP CDM at The Hyve. In 2016, I started at The Hyve with a main focus on OHDSI. Last year, in 2021, I also started a part-time PhD at the Erasmus MC medical informatics department. For me these are very exciting times with exciting academic and commercial projects.
You have been very active with EHDEN, which just completed a remarkable third year. What do you do with the collaborative, and what are you most proud of from last year?
My focus in EHDEN is on the OHDSI ETL tooling: WhiteRabbit, Rabbit in a Hat and Usagi. The Hyve maintains the OHDSI ETL tools and I am part of the faculty for the on-site training for the EHDEN partners, specifically focusing on the Rabbit tools. Besides that, I am co-managing the technical work package and helping out with reviewing data partner workplans and milestones.
I am very proud that we now have 143 data partners in EHDEN and specifically proud of the collaboration with data partners in the COVID-19 Rapid Data Partner call. In this data partner call, I was part of the EHDEN team that intensively collaborated with the data partners to convert their hospital records to the OMOP CDM. Although this was intensive, I enjoyed the interaction with the data partners a lot. In the first studies as part of the Evidence-a-thon, we already showed that this data can be used for calculating background rates of adverse events of special interest.
Data from registries typically offers very rich information on one or more related diseases. This complements the disease-agnostic EHR and claims data. Registry data is often smaller and more standardised than EHR data, but I learned that it can be extremely diverse and complex. Registry data is collected with a certain purpose and an underlying protocol that you need to take into account. This often means that different, typically disease-specific, coding systems are used that are hard to map losslessly to existing standard OMOP concepts.
Also, I have become more aware that the OMOP CDM was designed for longitudinal, routinely collected EHR and claims data. Many registries are cross-sectional and are missing event dates that are required in the OMOP CDM; for longitudinal registries only a relative date/time might be given. The goal of the Registry WG is to make the OHDSI community aware of these challenges. With this awareness, we can establish conventions to follow when converting registry data to the OMOP CDM.
In the future, maybe we could approach this the other way around: use the interoperability provided by the OMOP CDM to facilitate data collection from multiple centres into disease registries. This is currently out of scope for the Registry WG, but it is an area where a big impact can be made.
You were honored with the 2021 Titan Award for Data Standards. What did this honor mean to you, and what inspires you about being part of the OHDSI community?
The Titan Award was a great surprise and made me feel more empowered. I am still very much at the beginning of my career and I feel I am very slowly starting to make meaningful contributions to the OHDSI community. The recognition really encourages me and motivates me to be even more active in OHDSI.
The OHDSI community has always been very welcoming. This is such an ambitious group of skilled people, who always try to make time to guide newcomers; it is remarkable. Personally, I have grown to love the field of observational health research. It combines my passion for cleaning messy data and my background in medical science. OHDSI has even inspired me to go back to academia and pursue a (part-time) PhD in medical informatics!
You do a great deal of work with open-source tools at both The Hyve and within OHDSI. How important are the concepts of transparency and collaboration to you, and what are the specific tools you focus on within OHDSI?
Open source and open science are embedded in the foundations of The Hyve. Our mission is to “enable open science by developing and implementing open-source solutions and FAIRifying data in life sciences.” We are active partners in multiple open-source communities, like RADAR-base, cBioPortal and OpenTargets.
I personally enjoy that collaborating widens your world beyond your own organisation. I get a lot of inspiration and energy from communicating with others who are working on similar projects and might have very different ideas. Also, it is very satisfying to see your work being used, and appreciated, by people all over the world.
The Hyve is currently the maintainer of the ‘Rabbit Suite’: WhiteRabbit, Rabbit in a Hat and Usagi. Over the last few years we have developed new features for these tools and fixed issues reported by the community. That said, I have to give credit to Martijn Schuemie for designing these tools. We are merely improving upon already great tools!
The Hyve was announced as a subcontractor with Erasmus MC as part of the DARWIN EU project, and EHDEN’s data network will likely have an impact on the project. What do you think is the importance of this project, and what in particular are you looking forward to contributing to DARWIN EU?
DARWIN EU is a huge step towards making the efforts and successes of EHDEN sustainable after the project ends. The goals of the two projects differ (EHDEN's goals are wider in terms of evidence generation), but both can learn from each other and benefit from developments in the OHDSI community.
Personally, I am really excited to see the OMOP CDM taking root in Europe. DARWIN EU can provide a stamp of approval that, when used correctly, real-world data is a valid and reliable source of evidence. I am convinced that this will advance the spread of the OMOP CDM even more. This means more people might start using the OHDSI ETL tooling, which will likely spark new developments in this area. We have already seen organisations like The Hyve developing new ETL tools for OHDSI.
I like exploring outdoors, be it hiking, running or cycling. What I particularly enjoyed pre-pandemic was recording a run in any new place a work trip brought me. For instance, at the last in-person global OHDSI Symposium, I followed Rock Creek in Washington as far as my legs wanted to take me!
One interesting thing: my partner and I have a small herd of pets. We recently took in a puppy, to complement our cat, chicken and guinea pigs. Luckily my partner is a vet and makes sure they get the best care! And from her I am actually learning how a physician uses a (veterinary) EHR system!