Data Synthesis
Nothing beats real data, but sometimes you don’t have enough of it. We can help you synthesise data to improve your models and applications.
Some examples of data we’ve synthesised:
- PDFCrunch app - For a document layout detection project where structure mattered more than content, we synthesised thousands of documents with different layouts, filled with dummy text. This let us train a model that generalises to new layouts and extracts tabular data as CSVs with a high level of accuracy.
- cytora.com - We built a synthesiser that generates thousands of market representations (which feed into the insurer’s policy crafting). We used these to fine-tune PaLM 2 (Google Vertex AI) and smaller open-source models, increasing the accuracy of our predictions.
- all.health - We built a patient simulator that generates patient profiles and continuous telemetry data, simulating a patient wearing all.health’s proprietary wristband.
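To give a flavour of the layout synthesis idea from the first example, here is a minimal Python sketch. It is not the code used on that project: the page size, the block kinds, the size ranges and the names `Block` and `synthesise_layout` are all assumptions made for illustration. It stacks randomly sized blocks of dummy text down a page and records each block's kind and bounding box, which is the kind of label a layout detection model could train on.

```python
import random
from dataclasses import dataclass

# All constants below are illustrative assumptions, not project values.
PAGE_WIDTH, PAGE_HEIGHT = 595, 842  # roughly A4 in points
MARGIN = 40
BLOCK_KINDS = ("heading", "paragraph", "table")  # hypothetical label set
DUMMY_WORDS = "lorem ipsum dolor sit amet consectetur adipiscing elit".split()


@dataclass
class Block:
    kind: str    # label for the layout detector
    x: int       # left edge of the bounding box
    y: int       # top edge of the bounding box
    width: int
    height: int
    text: str    # dummy content; only the structure matters


def dummy_text(rng: random.Random, n_words: int) -> str:
    """Return n_words filler words drawn at random."""
    return " ".join(rng.choice(DUMMY_WORDS) for _ in range(n_words))


def synthesise_layout(seed: int) -> list[Block]:
    """Stack random blocks top to bottom until the page is full."""
    rng = random.Random(seed)  # seeded so every layout is reproducible
    blocks: list[Block] = []
    y = MARGIN
    while True:
        kind = rng.choice(BLOCK_KINDS)
        height = rng.randint(30, 200)
        if y + height > PAGE_HEIGHT - MARGIN:
            break  # the next block would run past the bottom margin
        width = rng.randint(200, PAGE_WIDTH - 2 * MARGIN)
        x = rng.randint(MARGIN, PAGE_WIDTH - MARGIN - width)
        text = dummy_text(rng, rng.randint(3, 12))
        blocks.append(Block(kind, x, y, width, height, text))
        y += height + rng.randint(10, 40)  # vertical gap before the next block
    return blocks


# One layout per seed gives a labelled synthetic dataset of any size.
layouts = [synthesise_layout(seed) for seed in range(1000)]
```

A real pipeline would go on to render each layout to a PDF or image and pair it with the bounding boxes as ground truth; that step is left out here.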
We can also help you set up systems for gathering more real-world data, e.g. an in-house data labelling platform or a data collection app.