Our API supports sampling on all /results endpoints, returning a uniform random sample of a result set for efficient data analysis and visualization. This is particularly useful for use cases like charting, where analyzing the full dataset isn't necessary: if a chart only has 4,000 pixels, 10,000 data points are plenty. Because sampling returns only a uniformly distributed subset of the rows, it reduces latency and cost when working with very large results. You can apply sampling to the following endpoints:

Example Sampling Request

import os

import dotenv
from dune_client.client import DuneClient

# Change to the directory that contains your .env file
os.chdir("<path_to_your_dotenv_file>")

# Load environment variables (including your Dune API key) from the .env file
dotenv.load_dotenv(".env")

# Set up the Dune Python client from the environment
dune = DuneClient.from_env()

# Fetch a sample of roughly 5,000 rows from the latest result of the query
query_result = dune.get_latest_result_dataframe(
    query=3582296,  # https://dune.com/queries/3582296 -> OHLC Price
    sample_count=5000,
)

print(query_result)
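
If you are not using the Python client, the same request can be expressed as a plain URL with sample_count as a query parameter. The sketch below only builds that URL and makes no network call. The base URL, the helper name build_sampled_results_url, and the column names "hour" and "close" are assumptions for illustration and should be checked against the API reference. The helper also rejects offset, limit, and filters, which this page lists as incompatible with sample_count.

```python
from urllib.parse import urlencode

# Assumed base URL for the Dune API; verify against the API reference.
BASE_URL = "https://api.dune.com/api/v1"

# Parameters this page lists as incompatible with sample_count.
INCOMPATIBLE_WITH_SAMPLING = ("offset", "limit", "filters")


def build_sampled_results_url(query_id, sample_count, columns=None, **other_params):
    """Hypothetical helper: build a /results URL that requests a sample."""
    if sample_count <= 0:
        raise ValueError("sample_count must be a positive integer")

    # Fail early on the client instead of sending an invalid combination.
    clashes = [name for name in INCOMPATIBLE_WITH_SAMPLING if name in other_params]
    if clashes:
        raise ValueError(
            "sample_count cannot be combined with: " + ", ".join(clashes)
        )

    params = {"sample_count": sample_count}
    if columns:
        # columns may be combined with sample_count to trim the sampled rows.
        params["columns"] = ",".join(columns)
    params.update(other_params)
    return f"{BASE_URL}/query/{query_id}/results?{urlencode(params)}"


if __name__ == "__main__":
    # "hour" and "close" are placeholder column names.
    print(build_sampled_results_url(3582296, 5000, columns=["hour", "close"]))
```

Validating on the client side is a design choice, not something the API requires; it simply surfaces the incompatibility before a request is sent.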

Sampling Parameters

sample_count

  • Type: integer
  • Description: Determines the number of rows to return as a sample from the result set. If the available dataset contains fewer rows than the specified sample_count, the entire dataset is returned.
  • Sampling returns a randomized subset of the data, so repeated requests with the same sample_count may return different rows.
  • When specifying sample_count (e.g., sample_count = 10000), the number is approximate. The actual number of rows returned may vary slightly (e.g., 10013, 10017), reflecting the probabilistic nature of the sampling process.
  • sample_count is incompatible with the offset, limit, and filters parameters.
  • sample_count can be used with columns to specify which data fields to include in the sample.
  • Because sampling is probabilistic, a sample_count that is very low relative to the total number of rows may occasionally return 0 rows (e.g., with sample_count = 10 on 100,000 rows, each row has only a ~0.01% chance of being included).
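
The behavior described above (approximate row counts, different rows on each request, the full dataset when it is smaller than sample_count) can be reproduced with a simple local model. The sketch below assumes each row is kept independently with probability sample_count / total_rows. That matches the per-row probability described above, but it is only an illustration: the function name bernoulli_sample is hypothetical, and the server's actual implementation is not documented here.

```python
import random


def bernoulli_sample(rows, sample_count, rng):
    """Keep each row independently with probability sample_count / len(rows)."""
    total = len(rows)
    if total <= sample_count:
        # Fewer rows than requested: return the entire dataset.
        return list(rows)
    p = sample_count / total
    return [row for row in rows if rng.random() < p]


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so every run differs
    rows = range(1_000_000)

    # The returned size hovers around 10,000 but is rarely exactly 10,000.
    for _ in range(3):
        print(len(bernoulli_sample(rows, 10_000, rng)))

    # A small dataset comes back whole.
    print(len(bernoulli_sample(range(50), 100, rng)))
```

Under this model the returned count follows a binomial distribution centered on sample_count, which is why a request for 10,000 rows typically yields a number close to, but not exactly, 10,000.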

Sampling Response