Being Open in the Era of Privacy

When Netflix announced a $1 million challenge for improving its recommendation engine in 2006, together with the release of 100 million movie ratings, little did they know what would happen next. Netflix had been working on recommender systems for years by then, were recognized for their business innovation as well as their technical excellence, had managed to hire the smartest data engineers and machine learning experts alike, and certainly knew all the ins and outs of movies, genres and popular actors. Still, they were eager to learn more.

The competition began on October 2nd, 2006. Within a mere six days, the first contestant succeeded in beating Netflix's existing solution. Six days! From getting access to the data and taking a first look, to building a movie recommender algorithm from scratch, all the way to making more accurate predictions for over a million ratings than anyone before. Six days, from zero to world class.

It didn’t stop there. Within a year, over 40,000 teams from 186 countries had entered the competition, all trying to improve on Netflix’s algorithm. Contestants increasingly started collaborating, sharing their learnings and taking lessons from others, forming bigger teams and ensembling ever more powerful models. Hardly ever before had such a rich and large dataset on consumer behavior been openly available. Together with a clearly stated and measurable objective, it provided a challenging, yet safe and fair sandbox for a worldwide community of intellectually driven engineers, all working together on advancing science while, as a more than welcome side effect, also helping Netflix improve its core algorithm.

The Case for #OpenBigData

Being open enables you to pick the brains of a wider group, to bring fresh perspectives to existing challenges, to build upon the creative minds of the many. And with data being the lingua franca of today’s business world, the common denominator across departments, across corporations, across industries, being open is really about sharing data, about sharing granular-level data at scale!

Openly sharing customer data at scale in 2006 was a bold move. But it was no coincidence that it was Netflix who did it. Already in their early years they had successfully established a culture of excellence, curiosity and courage, well documented in “one of Silicon Valley’s most important PowerPoint decks” [1]. Sharing data broadly takes courage, but it is even more so a sign of curiosity and a drive for excellence. No holding back, no hiding out, no making excuses. Netflix was never afraid of their competitors. They were afraid of no longer striving to be the best.

Openly sharing customer data at scale in 2019 is an (unl)awful move. Over the past years, the explosion in data volumes met a poorly regulated market with few sanctions being imposed, which allowed excessive misuse of personal data. The tide, though, has turned: both regulators and corporations are acknowledging privacy as a fundamental human right, one that is to be defended [2, 3]. This is indeed a new era of privacy.

Unfortunately, this plays into the hands of modern-day corporate gatekeepers: those decision makers who have never really been fond of transparency or of being challenged, and who were thus reluctant to share “their” data in the first place. It turns out they have found a new ally in defending their corporate data silos: privacy.

The Case for #SyntheticData

This is the point where one needs to tell the lesser-known part of the Netflix Prize story: as successful as the competition was for the company overall, they also had to pay a price in court. In fact, they were forced to cancel their second machine learning challenge, which was planned for 2011 [3]. Netflix had misjudged the anonymization measures they had put in place [4]. Even though they limited the data to movie ratings and their dates, merely linked to a scrambled user ID, that proved insufficient to prevent re-identification. It took only 16 days after the data was released for outsiders (with no superhuman hacking skills) to link these user IDs to freely available public data – with enough time left at hand to write up a whole paper on de-anonymization [5]. Netflix had unintentionally exposed the full movie history of part of their customer base, with no way to undo that privacy infringement. A decade later, Facebook had to learn the same painful lesson. Once the data is out and you have failed to properly anonymize it, no matter how good your intentions might have been, you will have a hard time undoing your actions.

This risk of re-identification in large-scale data is by now well understood by privacy and security experts [6], yet still widely underestimated by the corporate world. That’s why these experts face a challenging role within organizations: they need to educate their colleagues that most anonymization attempts for big data in fact fail to keep their customers safe. And these experts are forced to say NO more often than YES to a new initiative or a new innovation project, in order to keep privacy safe and secure.

the underestimated risk of re-identification

So, this is the big quest of our time: How to be open, while being private at the same time? How to put big data to good use, while still protecting each and everyone’s right to privacy? How to foster data-driven, people-centric and innovative societies and organizations, all at the same time, while not giving up an inch on safeguarding privacy?

We at Mostly AI set out to solve this challenge and developed a one-of-a-kind technical solution to this long-standing problem: an AI-based synthetic data generator. One that learns from actual behavioral data to generate statistically representative synthetic personas and their data: synthetic data that can be broadly shared, internally as well as externally, without exposing any individual. It’s all the value of the original data, but without the privacy risk.

This is 2019. It’s time to protect privacy, as well as to embrace the power of open again. It’s time to #GoSynthetic!

Synthetic Data Diamonds

My colleague Mitch likens synthetic data to synthetic diamonds when pitching our value proposition to potential customers. And it’s indeed a fitting analogy. Synthetic diamonds are nearly indistinguishable from actual diamonds to the human eye: they have the same structure, and they bear the same valuable characteristics, like hardness, purity and thermal conductivity. Yet they take 3 billion years less to form, come at a fraction of the cost and are considered more ethical in terms of sourcing.

Along the same lines, synthetic data can likewise be nearly indistinguishable from actual data, can have the same structure, and can retain all the same valuable properties (i.e. the statistical information) of the actual data. Yet machines can generate synthetic data in unlimited quantities, and more importantly, synthetic data allows big data to be utilized and shared without putting anyone’s privacy at risk. Synthetic data has the potential to become the new risk-free and ethical norm for leveraging customer data at scale. Finally, there is a solution for big data privacy!

However, just like with diamonds, the process of generating high-quality synthetic data is anything but trivial. At Mostly AI we have mastered the automated generation of synthetic structured data over the past two years by leveraging state-of-the-art generative AI, and we are confident in claiming that we offer the world’s most advanced synthetic data engine.

As a simple demonstration, as part of this first post, let’s apply our solution to a publicly available diamonds dataset with 53’940 records and 10 attributes. As can be seen, the attributes are a mix of categorical and numerical variables.

first 8 records of the original diamonds dataset
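
For readers who want to follow along: the numbers match the well-known diamonds dataset that ships with the ggplot2 package in R, so a copy can be loaded and exported locally. A minimal sketch (the file name diamonds.csv is simply the one used in the commands further below):

# the diamonds dataset ships with the ggplot2 package
library(ggplot2)

dim(diamonds)      # 53940 rows, 10 columns
str(diamonds)      # ordered factors (cut, color, clarity) plus numeric attributes
head(diamonds, 8)  # the first 8 records, as shown above

# export to CSV, to be fed into the synthesization engine below
write.csv(diamonds, "diamonds.csv", row.names = FALSE)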

Generating an equally sized, structurally identical, yet synthetic version of this dataset is, thanks to the flexibility of our engine, as simple as:

> mostly train diamonds.csv
> mostly generate -n 53940

This will create a new data file with 53’940 records, where none of the generated records has any direct relationship to the records in the original dataset anymore. Hence the information, and thus the privacy, of any individual diamond is protected (in case they care :)), while the structure of the population is retained.

first 8 records of a synthetically generated diamonds dataset

Furthermore, the statistical properties of the various attributes are successfully retained. Here is a side-by-side comparison of the frequencies of the three categorical variables clarity, cut and color. All of these univariate distributions are matched to near perfection.

side-by-side comparison of distributions of categorical variables
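
Comparisons like these can be reproduced with a few lines of R. A sketch for one of the categorical variables, assuming the synthetic output was saved as diamonds_synthetic.csv (a hypothetical file name chosen for this example):

library(ggplot2)
library(dplyr)

# original data, with factors converted to plain characters for easy stacking
orig <- diamonds %>%
  mutate(across(where(is.factor), as.character), source = "original")

# synthetic data, assuming it was saved as diamonds_synthetic.csv (hypothetical name)
syn <- read.csv("diamonds_synthetic.csv", stringsAsFactors = FALSE) %>%
  mutate(source = "synthetic")

combined <- bind_rows(orig, syn)

# relative frequencies of a categorical variable, original vs. synthetic
freq <- combined %>%
  count(source, clarity) %>%
  group_by(source) %>%
  mutate(share = n / sum(n))

ggplot(freq, aes(x = clarity, y = share, fill = source)) +
  geom_col(position = "dodge") +
  labs(y = "relative frequency", title = "clarity: original vs. synthetic")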

Similarly, the engine is also able to retain the distributions of the numerical variables, and nicely captures location, skew, the tails as well as the multiple modes.

side-by-side comparison of price percentiles
side-by-side comparison of histograms for price, x, y and z
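
The numerical attributes can be checked the same way, for instance by comparing percentiles and overlaying histograms. Again just a sketch, reusing orig, syn and combined from the previous snippet:

# price percentiles, original vs. synthetic
probs <- seq(0.1, 0.9, by = 0.1)
rbind(
  original  = quantile(orig$price, probs),
  synthetic = quantile(syn$price, probs)
)

# overlaid histograms for price
ggplot(combined, aes(x = price, fill = source)) +
  geom_histogram(bins = 50, position = "identity", alpha = 0.5)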

More importantly, the (non-linear) interdependencies between the various variables are properly captured as well. These statistical multivariate relationships are the key value of any dataset, as they help to gain a deeper understanding of the underlying domain.

As can be seen, the patterns of the synthetic dataset mimic the patterns found in the original dataset to a very high degree. By analyzing the synthetic data we can gain various insights into the diamond market, whether that’s the relationship between market prices and carats, or the dominance of the “Ideal” cut among diamonds with clarity “IF”. Even unexpected patterns, like the lack of “just-a-little-less-than-2-carat” diamonds, are perfectly captured as well, if you take a close look at the upper chart.
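
Both of these patterns can be inspected directly in the synthetic data; once more just a sketch, reusing the combined data frame from before:

# price vs. carat, original next to synthetic
ggplot(combined, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1, size = 0.3) +
  facet_wrap(~ source)

# share of cuts among diamonds with clarity "IF"
combined %>%
  filter(clarity == "IF") %>%
  count(source, cut) %>%
  group_by(source) %>%
  mutate(share = n / sum(n))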

Apart from descriptive statistics, the synthetic data also turns out to be of immense value for training machine learning algorithms and for predictive tasks. This opens up a whole range of opportunities to power next-generation AI, as more data can be gathered across different sources, across borders and industries, without infringing on individuals’ privacy.

Let’s now take a subset of 40k actual records, and generate 40k synthetic records based on that. We will need the remaining 13’940 actual records as a fair holdout set to benchmark our two models for diamond prices: one model trained on actual data, and another one trained on synthetic data. As expected, there is a loss in accuracy, as the data synthesis results in an information loss (to protect privacy), but the loss is within a rather small margin: in terms of mean absolute error, the model trained on synthetic data achieves $350 vs. $296 for the model trained on actual data. Still, the $350 is a far cry from a naive model, which would in this case result in an error of $3036 if no data were available at all.

R code for benchmarking models trained on actual vs. synthetic data
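
For illustration, a benchmark along these lines might be sketched as follows. The model choice (a random forest) and the file name of the 40k synthetic records are assumptions made for this sketch, not necessarily what produced the numbers above:

library(randomForest)   # model choice is an assumption for this sketch
set.seed(42)

# drop the "ordered" attribute of the factor columns so that all three data
# frames share plain factors with identical levels
prep <- function(df) {
  for (col in c("cut", "color", "clarity")) {
    df[[col]] <- factor(as.character(df[[col]]), levels = levels(diamonds[[col]]))
  }
  df
}

idx     <- sample(nrow(diamonds), 40000)
train   <- prep(diamonds[idx, ])
holdout <- prep(diamonds[-idx, ])
# 40k synthetic records, generated from the 40k training subset (hypothetical file name)
syn40k  <- prep(read.csv("diamonds_synthetic_40k.csv", stringsAsFactors = FALSE))

mae <- function(actual, predicted) mean(abs(actual - predicted))

fit_actual    <- randomForest(price ~ ., data = train,  ntree = 100)  # ntree kept small for speed
fit_synthetic <- randomForest(price ~ ., data = syn40k, ntree = 100)

mae(holdout$price, predict(fit_actual, holdout))      # model trained on actual data
mae(holdout$price, predict(fit_synthetic, holdout))   # model trained on synthetic data
mae(holdout$price, mean(train$price))                 # naive baseline: always predict the mean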

In an upcoming post we will dive deeper into an example of synthetic sequential data, which actually turns out to be the prevalent form when it comes to privacy-sensitive user data. But that is a story for another day. For now, let’s simply generate a million more synthetic data diamonds and wallow in them for the time being.

> mostly generate -n 1000000
> mostly wallow :)