Organizing Datasets

Tag, Filter, Sort, Import, Freeze and Split

Using Tags to Organize Your Dataset

Kiln uses tags to organize your dataset. You can add tags to any run/sample and then filter by tag, which makes it easy to find specific runs.

Some examples of how you might use tags within a team:

  • Working with eval teams:

    • Add the "needs_review" tag when data is ready for review by a human eval team

    • Ask a human eval team to review the new batch of synthetic data. New synth data is automatically tagged with "synthetic" and "synthetic_session_id".

  • Defining a "golden" dataset: Have QA tag a "golden" dataset reserved for evals.

  • Bug resolution: QA can tag examples of a common issue (e.g. "issue_unprofessional_tone"). Data scientists can then run evals of different methods of fixing the issue.

  • Regression testing: Tag important customer use cases (e.g. "customer_use_case") and run evals to ensure the model doesn't regress on them prior to a new release.

  • Fine-tuning: Exclude data with certain tags (golden, customer_use_case, etc.) from fine-tuning datasets to prevent contamination.
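The tag-exclusion idea in the last bullet can be sketched in a few lines of plain Python. This is only an illustration of the filtering logic, not Kiln's implementation or SDK: the `exclude_tagged` helper and the dict-based items are hypothetical, and in Kiln itself you would do this through the UI.

```python
def exclude_tagged(items, excluded_tags):
    """Keep only the items that carry none of the excluded tags."""
    excluded = set(excluded_tags)
    return [item for item in items if excluded.isdisjoint(item["tags"])]


# Hypothetical dataset items, represented as plain dicts for illustration.
items = [
    {"id": 1, "tags": ["synthetic"]},
    {"id": 2, "tags": ["golden"]},
    {"id": 3, "tags": ["customer_use_case", "synthetic"]},
    {"id": 4, "tags": []},
]

# Items 2 and 3 are reserved for evals, so they stay out of the training data.
training_items = exclude_tagged(items, ["golden", "customer_use_case"])
```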

Sort and Filter

The dataset view offers a number of tools that make working with large datasets easier:

  • Filter: Click the filter button to filter to specific tags

  • Sort: You can sort by any column by clicking its header

  • Multi-select: You can enter "selection" mode by clicking the select button

    • Select any row by clicking it

    • Select a range of rows by clicking the first, then holding shift while clicking the last

Batch Editing

Once you have selected rows you can perform a number of batch actions:

  • Add tags

  • Remove tags

  • Delete dataset items

Importing Data into Your Dataset

If you already have a dataset, it's easy to import it into Kiln. Open the dataset tab, then click "Upload File" to add your data.

The format must be a CSV file with a header row. The following columns are supported:

  • input [Required] - The input to the task. If the task has an input schema, this must be a JSON string conforming to that schema.

  • output [Required] - The output of the task. If the task has an output schema, this must be a JSON string conforming to that schema.

  • reasoning [Optional] - If your model is a reasoning model that outputs reasoning/thinking text before the output (for example, R1, QwQ, etc.), you can provide that text here. It will be visible in the UI and available for fine-tuning a reasoning model.

  • chain_of_thought [Optional] - If your model outputs chain-of-thought text before the output, you can provide that text here. It will be visible in the UI and available for fine-tuning a thinking model.

  • tags [Optional] - A comma-separated string listing the tags you want to add to this row. For example: tag1, tag2.
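As a rough sketch of the column layout described above, the following Python builds CSV text with a header row using only the standard library. The task schemas, field names such as "question" and "answer", the tag values, and the `build_csv` helper are all made up for illustration. It assumes a task with both an input and an output schema, so both columns hold JSON strings; for a task without schemas, plain text would go there instead.

```python
import csv
import io
import json

# Only the required columns plus the optional "tags" column are used here.
FIELDNAMES = ["input", "output", "tags"]

# Hypothetical rows for a hypothetical task with input and output schemas.
rows = [
    {
        "input": json.dumps({"question": "What is 2 + 2?"}),
        "output": json.dumps({"answer": "4"}),
        "tags": "tag1, tag2",
    },
    {
        "input": json.dumps({"question": "Name a primary color."}),
        "output": json.dumps({"answer": "Red"}),
        "tags": "tag1",
    },
]


def build_csv(rows):
    """Return CSV text with a header row followed by one line per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)  # quoting of embedded commas and quotes is handled by csv
    return buffer.getvalue()


csv_text = build_csv(rows)
```

Saving `csv_text` to a .csv file gives you something to try with the "Upload File" button. Letting the `csv` module do the quoting matters here, because JSON strings contain commas and double quotes that would otherwise break the columns.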

[Screenshot: the CSV import UI]

If you prefer working in Python, or have a complex import use case, you can use our Python SDK to add data to a Kiln project. It includes validators that ensure your data conforms to the required schemas.

See our Python docs for an example.

Dataset Splits

When creating a fine-tune, you can define a "dataset split". This is a frozen subset of your data.

  • Dataset splits may be broken into subsets like "train", "validation" and "test", which are useful for systematically training and evaluating models.

  • Items are randomly assigned to subsets (train/test/val), but the assignment is static: items do not shift between subsets once the dataset split is created.

  • Dataset splits do not grow or change when you add new data. They are frozen at the point in time when they are created. This makes it easier to run multiple experiments (fine-tunes, evals, etc.) on exactly the same training/eval datasets.
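The behavior described above (random assignment, then frozen) can be illustrated with a small Python sketch. This is not Kiln's implementation: the `create_split` function, the proportions, the seed, and the item IDs are all assumptions chosen for the example.

```python
import random


def create_split(item_ids, proportions, seed=0):
    """Randomly assign each item to exactly one subset and return a frozen snapshot."""
    ids = list(item_ids)  # copy, so later changes to the dataset do not leak in
    random.Random(seed).shuffle(ids)

    names = list(proportions)
    split = {}
    start = 0
    for index, name in enumerate(names):
        if index == len(names) - 1:
            end = len(ids)  # the last subset takes whatever remains
        else:
            end = min(len(ids), start + round(len(ids) * proportions[name]))
        split[name] = tuple(ids[start:end])  # tuples cannot be modified in place
        start = end
    return split


# Hypothetical dataset of ten items.
dataset = [f"item_{i}" for i in range(10)]
split = create_split(dataset, {"train": 0.8, "test": 0.1, "val": 0.1})

# Data added after the split was created is not part of it.
dataset.append("item_10")
```

Because the split stores its own copy of the item IDs, every experiment that reads `split` sees exactly the same train, test and val items, no matter how the dataset grows afterwards.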
