Skuto

Training data

Training data is the huge collection of text, images and code an AI model learns from. With some chatbots, what you type can become training data for future models, unless you switch that off in the privacy settings.

An AI model doesn’t know anything by itself. It learned to write and answer by digesting enormous amounts of text: books, websites, code. That collection is the training data. The part that matters for you: some providers also use your conversations to train future models, depending on your plan and your settings.

Picture a bar owner pasting a supplier’s email into a chatbot to draft a reply. If the “improve the model” setting is on, that email, names, prices and all, may be used in training. It is unlikely to resurface word for word for someone else, but it has left the building.

The fix is usually one switch, and it’s worth checking before you paste anything sensitive. Our paste checker shows how each major chatbot treats what you type, plan by plan.

Where you’ll meet this

  • ChatGPT → Settings → Data Controls → “Improve the model for everyone”
  • Claude → Settings → Privacy, where training preferences live
  • Gemini → your Google account’s Gemini Apps Activity page

See also opt-out and LLM.
