Two Companies Roll Out Privacy Screens for ChatGPT
Sensitive corporate material is not protected in the large language model software.
The large language model AI system ChatGPT, from OpenAI, has seen adoption at a blinding rate. Commercial real estate, not known for living on the cutting edge of technology, has already started to employ it. For example, some apartment owners use it to generate marketing pitches or emails to prospects and residents.
But “blinding” applies in two ways: the speed of adoption, of course, but also a frequent blindness to what the software is actually doing. There is its inclination to make things up, including facts, data, and sources. There is the potential for copyright infringement, since the tools have been trained on large bodies of work without permission from all the rights owners to use the results in commercial applications.
Another major issue is that as people use ChatGPT, or the many products that connect to it for AI processing, they are feeding material into the system. While OpenAI has said it will no longer use such data to train its models, it still holds the data for at least 30 days. And intentions don’t matter if an error or bug leaks information, as happened in March when ChatGPT exposed users’ conversation histories, according to a BBC report.
Two companies have created filters that keep personally identifiable information, or PII, about customers or employees out of the AI system. Releasing PII can breach data privacy laws and expose a company to regulatory risk.
The first is Private AI, which has been using machine learning (another branch of AI techniques) to detect PII such as names, addresses, credit card numbers, birth dates, and phone numbers. The machine learning component helps the company improve its ability to recognize such data over time, even when it appears not in databases but in documents, chats, and other freeform text.
Its new product is PrivateGPT, designed to help companies “safely leverage OpenAI’s chatbot without compromising customer or employee privacy,” according to a company press release. It redacts more than 50 types of PII from the prompts users type before they go to ChatGPT. When the answers come back, it reinserts the PII.
“LLMs are not excluded from data protection laws like the GDPR, HIPAA, PCI DSS, or the CPPA,” the release quoted Patricia Thaine, Co-Founder and CEO of Private AI, as saying. “The GDPR, for example, requires companies to get consent for all uses of their users’ personal data and also comply with requests to be forgotten.” GDPR is the European data protection law. HIPAA is a U.S. law covering, among other things, healthcare data privacy. PCI DSS is the Payment Card Industry Data Security Standard, not a government regulation but a standard mandated by the payment card industry.
At the end of March, Cado Security announced Masked-AI, an open-source library that developers can use to redact PII in much the same way: it stores the sensitive data internally, substitutes a placeholder, and then reverses the process when ChatGPT returns its result.
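Both tools describe the same basic pattern: detect PII in the outgoing prompt, swap in placeholders, send the masked text to the chatbot, and restore the original values in the reply. The following minimal Python sketch illustrates that round trip; the regex patterns, function names, and the stand-in for the ChatGPT call are assumptions made for illustration, not either vendor’s actual API, and real products use machine learning to detect far more PII types.

import re
import uuid

# Illustrative sketch only: not PrivateGPT's or Masked-AI's real API, just the
# general redact-and-reinsert pattern the two products describe.

# Two simple example patterns; production tools recognize 50+ PII types.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def mask(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace detected PII with unique placeholders and remember the mapping."""
    mapping: dict[str, str] = {}
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.findall(prompt):
            token = f"[{label}_{uuid.uuid4().hex[:8]}]"
            mapping[token] = match
            prompt = prompt.replace(match, token)
    return prompt, mapping

def unmask(response: str, mapping: dict[str, str]) -> str:
    """Reinsert the original PII wherever the model echoed a placeholder."""
    for token, original in mapping.items():
        response = response.replace(token, original)
    return response

masked, mapping = mask("Email jane.doe@example.com about unit 4B, call 555-123-4567.")
print(masked)                    # placeholders only; no PII leaves the company
# reply = call_chatgpt(masked)   # hypothetical call sending the masked prompt to the model
reply = masked                   # stand-in for the model echoing the placeholders back
print(unmask(reply, mapping))    # original values restored in the response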
All well and good, and yet still insufficient for companies. Beyond PII, companies have their own data that should remain private. PII doesn’t include sales projections, strategic analyses, company secrets, or anything else that is proprietary and sensitive.