How to Access Tables
This tutorial will help you get started with querying the Bitpulse Data Catalog, either through BigQuery Studio or programmatically from code.
Accessing Data Through BigQuery Studio
Note: Before you can follow this tutorial, you must have already requested and been granted access to the Data Catalog. If you haven't done so yet, please visit the Access Permissions page to submit your access request.
After being granted access to the Bitpulse Data Catalog:
- Log in to Google Cloud Console
- Navigate to BigQuery Studio in the sidebar menu
- You should see the cripto-data project selected; if not, click the name of the current project to select it
- Create a new query by clicking the blue "+"
- Write your query, always using the dataset prefix "defi_prepared"
- Run your query by clicking the blue "Run" button
- Save or export your results
Best Practices
- Use Partitioning: Many tables are partitioned by block_timestamp. Include this field in your WHERE clause to improve performance (see the example query after this list).
- Limit Results: Always use LIMIT in exploratory queries to avoid processing excessive data.
- Join Efficiently: Use the integer IDs for joins rather than strings for better performance.
- Check Data Freshness: Use queries like this to check the latest available data: SELECT MAX(block_timestamp) FROM `bitpulse-analytics.defi_liquidity.block`;
- Use Aggregation: Pre-aggregate data when possible to reduce the amount of data processed.
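For example, a query that combines partition filtering, explicit column selection, ordering, and a LIMIT could look like the one below. This is a sketch only: it reuses the block table from the freshness check above and assumes block_timestamp is a TIMESTAMP column.
SELECT block_timestamp
FROM `bitpulse-analytics.defi_liquidity.block`
WHERE block_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
ORDER BY block_timestamp DESC
LIMIT 100;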
Accessing Data Programmatically
This section explains how to access the Bitpulse Data Catalog programmatically using various programming languages. Whether you're using Python, Node.js, Java, or another language, you'll need a Google Cloud service account with appropriate permissions.
Prerequisites
Before you begin, ensure you have:
- A Google Cloud service account with BigQuery access permissions
- The appropriate BigQuery client library for your programming language
- Basic familiarity with SQL and your chosen programming language
Note: You must request access to the Data Catalog through our Access Permissions form. You'll need to provide your service account email during this process.
Creating a Service Account (If You Don't Have One)
Step 1: Create a Service Account in Google Cloud Console
- Go to the Google Cloud Console → IAM & Admin → Service Accounts
- Click "+ Create Service Account"
- Enter:
  - Service account name (e.g., bigquery-accessor)
  - Description (optional)
- Click "Create and continue"
Step 2: Create and Download a JSON Key
- In the list of service accounts, find your new one
- Click the ⋮ (three dots) → "Manage keys"
- Under Keys, click "Add key" → "Create new key"
- Choose JSON → "Create"
- The key file will download automatically — it's your service account key
- Save it securely
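If you prefer the command line, the same two steps can also be done with the gcloud CLI. This is a minimal sketch: the account name matches the example above, and YOUR_PROJECT_ID is a placeholder for your own project ID.
gcloud iam service-accounts create bigquery-accessor --display-name="bigquery-accessor"
gcloud iam service-accounts keys create your-service-key.json \
  --iam-account=bigquery-accessor@YOUR_PROJECT_ID.iam.gserviceaccount.com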
Important: Share the service account email (found in the key file) with us through the Access Permissions form. We'll grant this service account access to the Bitpulse Data Catalog.
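If you are not sure which email to share, you can read it straight from the downloaded key file. A minimal Python sketch, using the same placeholder path as the example below:
import json

# The service account email to share is stored in the "client_email" field of the key file
with open("path/to/your-service-key.json") as key_file:
    print(json.load(key_file)["client_email"])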
Integration Examples
Python
- Install the required packages:
pip install google-cloud-bigquery pandas
- Create a Python script to authenticate and query data:
from google.cloud import bigquery
from google.oauth2 import service_account
import pandas as pd
# Path to your service account key file
KEY_PATH = "path/to/your-service-key.json"
# Create credentials using your service account key
credentials = service_account.Credentials.from_service_account_file(
KEY_PATH,
scopes=["https://www.googleapis.com/auth/bigquery"]
)
# Initialize the BigQuery client
client = bigquery.Client(
credentials=credentials,
project=credentials.project_id
)
# Example query
query = """
SELECT id, slug, name
FROM `defi_prepared.network`
ORDER BY name;
"""
# Execute the query
df = client.query(query).to_dataframe()
# Display results
print(df)
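The same client can be reused for other queries, for example the data-freshness check from the Best Practices section above. A brief sketch:
# Reuse the client above to check how fresh the data is
freshness_query = """
SELECT MAX(block_timestamp) AS latest_block
FROM `bitpulse-analytics.defi_liquidity.block`
"""
latest_block = list(client.query(freshness_query).result())[0]["latest_block"]
print(f"Latest available block timestamp: {latest_block}")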
Best Practices for Programmatic Access
Security
- Store Credentials Securely:
  - Never commit service account keys to version control
  - Use environment variables or secure secret management solutions (see the sketch after this list)
- Principle of Least Privilege:
  - Request only the permissions your application needs
  - Create separate service accounts for different applications/purposes
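For example, instead of hard-coding the key path as in the script above, you can point the standard GOOGLE_APPLICATION_CREDENTIALS environment variable at the key file; the client library picks it up automatically. A minimal sketch:
from google.cloud import bigquery

# Set the variable outside your code, e.g. in your shell or deployment config:
#   export GOOGLE_APPLICATION_CREDENTIALS="/secure/path/your-service-key.json"
# The client then authenticates via that key without any path appearing in the source.
client = bigquery.Client()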
Performance
- Query Optimization:
  - Use parameterized queries to prevent SQL injection and improve caching (see the sketch after this list)
  - Filter on partitioned columns like block_timestamp when possible
  - Limit result sets to only the data you need
  - Use SELECT with specific column names instead of SELECT *
- Resource Management:
  - For large datasets, use streaming or chunking to process data efficiently
  - Implement pagination for large result sets
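The sketch below combines a parameterized query with page-by-page processing, reusing the client created in the Python example above. The @slug value and the process_row helper are hypothetical; the table and columns are the ones used earlier.
from google.cloud import bigquery

# Parameterized query: the value is passed separately, never interpolated into the SQL string
query = """
SELECT id, slug, name
FROM `defi_prepared.network`
WHERE slug = @slug
LIMIT 1000
"""
job_config = bigquery.QueryJobConfig(
    query_parameters=[bigquery.ScalarQueryParameter("slug", "STRING", "ethereum")]
)

# Fetch results in pages of 500 rows instead of materializing everything at once
rows = client.query(query, job_config=job_config).result(page_size=500)
for page in rows.pages:
    for row in page:
        process_row(row)  # process_row is a hypothetical handler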
Reliability
- Error Handling:
  - Implement comprehensive error handling and logging
  - Use exponential backoff for retries on transient errors (see the sketch after this list)
  - Handle API rate limits with appropriate retry mechanisms
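As an illustration, a retry wrapper with exponential backoff and jitter might look like the sketch below; which exceptions count as transient is an assumption you should adapt to your own needs.
import random
import time

from google.api_core import exceptions as api_exceptions

# Errors that are usually safe to retry (assumed transient)
TRANSIENT_ERRORS = (
    api_exceptions.TooManyRequests,
    api_exceptions.ServiceUnavailable,
    api_exceptions.InternalServerError,
)

def run_query_with_retry(client, query, max_attempts=5):
    """Run a BigQuery query, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return client.query(query).result()
        except TRANSIENT_ERRORS:
            if attempt == max_attempts:
                raise
            # Wait 1s, 2s, 4s, ... plus up to one second of jitter before retrying
            time.sleep(2 ** (attempt - 1) + random.random())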
Next Steps
Now that you're familiar with the basics, explore the Available Tables section to learn more about each table's schema and relationships.
For performance optimization tips when working with large datasets, check out our Performance Tips guide.
If you have any questions or need assistance, please reach out to us at support@bitpulse.io.