Synthea is an open-source, synthetic patient generator that models up to 10 years of the medical history of a healthcare system. Synthea creates realistic patient data, including the patients' health records, in a variety of formats and with varying levels of complexity. Having access to realistic data is crucial for any kind of modeling. It is of utmost importance for a complex and highly connected system. Add sensitive to the list of adjectives for a system like healthcare. Real health care data, or protected health information (PHI), must be just that - protected.

At Optum, we store a large pool of PHI in our data lake. We use the data lake to learn about the health care system and to make it easier and more affordable for everyone. We pay for our health insurance, too, so it's really a win-win. Though we have access to the data lake, it is best practice to use mock data for demonstration and testing purposes. Not using real PHI maintains the security and anonymity of our members.

The power of Synthea is in simplifying data generation without compromising on its quality. Synthea has a straightforward CLI tool that interacts with Gradle tasks. The output can be customized to specific illnesses, locations, and population sizes using command-line arguments. Though I was only interested in patients with cancer, Synthea can generate data for over 90 different illnesses, from dermatitis to PTSD to appendicitis to dementia. There is also a guide for creating your own modules if a specific illness you want in your model is not there.

In addition to security, synthetic data helps development progress without data-related blockers. In the past, projects have been delayed because there was no access to real data, or not enough of it. Getting data into multiple environments further complicates this problem. Synthea equips teams with a tool to avoid these delays. Since Synthea is Gradle-based, multiple properties files can be used to differentiate between environments. This means data can be generated on the fly, in a variety of locations. There is no need to pass around a large data file. After development, Synthea can help with load and stress testing. Since Synthea's output is highly customizable, different test cases can be easily generated, again, all on the fly.

My assignment was to populate part of our graph database with mock patients that have different types of cancer. The population size needed to be roughly 200 patients. The consumer of the data is an application that helps nurses manage their populations of patients with cancer.

My step-by-step process followed the quick start directions in the repository's README, with the exception of changing a few lines in two files. I wanted the output to be CSV, so in the synthea.properties file I set all of the other exporter variables to false and the CSV exporter to true. This generates a CSV directory in the output folder, with the different categories of output (patient, provider, condition, etc.) in their respective CSV files. Our model required an additional column in the output, so I also changed a few lines in CSVExporter.java. Instead of manually adding the column after Synthea runs, I was able to add to the existing code for column generation. First, in the method writeCSVHeaders, I added the title of the column I wanted to add.
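The column change described above can be sketched roughly as follows. This is a minimal, self-contained illustration of the idea and not Synthea's actual CSVExporter code: the class name, the CANCER_TYPE column, the other column names, and the helper methods are all hypothetical. It only shows the two places that have to stay in sync, the header line and each data row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a CSV exporter; this is not Synthea's CSVExporter.
class CsvColumnSketch {
  // Hypothetical column title, used only for this illustration.
  static final String EXTRA_COLUMN = "CANCER_TYPE";

  /** Builds the header line with the extra column title appended at the end. */
  static String headerLine() {
    return String.join(",", "ID", "BIRTHDATE", "GENDER", EXTRA_COLUMN);
  }

  /** Builds one data row; the extra value sits in the same position as its header. */
  static String patientLine(String id, String birthDate, String gender, String cancerType) {
    List<String> fields = new ArrayList<>();
    fields.add(id);
    fields.add(birthDate);
    fields.add(gender);
    fields.add(clean(cancerType));
    return String.join(",", fields);
  }

  /** Turns null into an empty field and swaps commas for spaces so a value cannot split the row. */
  static String clean(String value) {
    return value == null ? "" : value.replace(",", " ");
  }

  public static void main(String[] args) {
    System.out.println(headerLine());
    System.out.println(patientLine("p-001", "1961-04-12", "F", "breast cancer"));
  }
}
```

Appending the new column at the end of both the header and the row keeps the existing column positions unchanged, which is usually the safer choice when something downstream already reads the file by position.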