Diveplane is a startup attempting to fix the privacy issues associated with data used for artificial intelligence training. The company this week launched what it says is the industry’s “first verifiable twin dataset.” The product, called GEMINAI, is meant to help organizations analyze sensitive datasets without risking the information being compromised.
GEMINAI creates a “twin dataset” that replaces any personally identifiable information with synthetic data and can be used for data modeling and analysis. The company says the twin maintains the statistical relationships of the original dataset. This differs from most privacy techniques, in which slices of information, like names and Social Security numbers, are masked but the data is still at risk of being converted back to its original state.
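To make the idea of a synthetic twin concrete, here is a minimal Python sketch. It is not Diveplane's method, which the company has not described here. It assumes a toy dataset with two numeric columns (a hypothetical age and blood pressure reading), fits a simple two-variable Gaussian to it, and samples brand-new rows from that fit. The function names and the data are invented for illustration, and a simple model like this carries no formal privacy guarantee.

```python
import math
import random


def fit_gaussian_2d(rows):
    """Estimate means, variances and covariance of two numeric columns."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    vx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    vy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    cxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return mx, my, vx, vy, cxy


def sample_twin(params, n, rng):
    """Draw n synthetic rows from the fitted Gaussian (2x2 Cholesky factor)."""
    mx, my, vx, vy, cxy = params
    a = math.sqrt(vx)
    b = cxy / a
    c = math.sqrt(max(vy - b * b, 0.0))
    twin = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        twin.append((mx + a * z1, my + b * z1 + c * z2))
    return twin


def correlation(rows):
    """Pearson correlation between the two columns."""
    _, _, vx, vy, cxy = fit_gaussian_2d(rows)
    return cxy / math.sqrt(vx * vy)


# Hypothetical "real" records: (age, blood pressure), invented for this sketch.
rng = random.Random(0)
real = []
for _ in range(2000):
    age = rng.uniform(20, 80)
    real.append((age, 90 + 0.5 * age + rng.gauss(0, 5)))

# The twin contains no original rows, only samples from the fitted model.
twin = sample_twin(fit_gaussian_2d(real), len(real), rng)

print(f"real correlation: {correlation(real):.3f}")
print(f"twin correlation: {correlation(twin):.3f}")
```

The two printed correlations should be close, which is the sense in which the twin keeps the "statistical relationships" of the original while sharing none of its rows. Masking, by contrast, would leave every other value of each real record in place.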
Diveplane hopes GEMINAI will help businesses comply with privacy laws and regulations such as the E.U.’s General Data Protection Regulation and U.S. medical privacy rules such as HIPAA.
“We love seeing AI increasingly adopted by many industries, but we’re finding that not all AI is created and trained equally,” said Dr. Michael Capps, CEO of Diveplane. “Many businesses are forced to use inaccurate or incomplete data to train their AI due to privacy requirements, which can lead to the AI making poor or misleading decisions. With GEMINAI, we’re eliminating that risk by creating a verifiable synthetic ‘twin’ of the dataset, so that businesses don’t need to sacrifice the quality of their AI for the sake of privacy.”
Diveplane says it already has interest in these “twin datasets” from industries constrained by privacy regulations. AI is playing a growing role in healthcare research, but its use has been constrained by HIPAA and other privacy requirements. Using GEMINAI, hospitals could share “truly anonymized patient records” with research organizations. This may mean that more nuanced analyses of patient care, drawing on data from outside traditional research settings, could feed into research, for example by isolating the effect of factors like diet and genetic markers on the results of cancer treatment.
It may not be all better medical care and rainbows. GEMINAI will also likely help facilitate the multibillion-dollar industry that has sprung up around data trading, for example by letting companies anonymize information and sell it to advertisers without any of the current privacy repercussions. We may get a little more anonymized, but we may also all be a little more observed. Overall, I think GEMINAI is an important step toward AI being able to do something truly great for humanity.