Adapted from Hugging Face's dataset creator skill for FF-Terminal, this skill provides comprehensive tools for creating, managing, and publishing datasets on the Hugging Face Hub. Specialized for agricultural and farm-related data including sensor readings, crop yields, weather data, soil measurements, and farming operations data. Supports multiple dataset formats and provides quality assurance features for data integrity.
slug hf_dataset_creator name Hugging Face Dataset Creator summary Create and manage datasets on Hugging Face Hub with support for agricultural data, sensor readings, and farm-related datasets description Adapted from Hugging Face's dataset creator skill for FF-Terminal, this skill provides comprehensive tools for creating, managing, and publishing datasets on the Hugging Face Hub. Specialized for agricultural and farm-related data including sensor readings, crop yields, weather data, soil measurements, and farming operations data. Supports multiple dataset formats and provides quality assurance features for data integrity. version 0.1.0 tags ["huggingface","datasets","agriculture","farming","data-management","iot","sensors","research"] triggers ["dataset","huggingface","agricultural data","farm data","sensor data","crop data","upload data","publish dataset","create dataset"] priority default assets [] recommended_tools ["analyze_data","write_file","run_command","tavily_search","generate_image_gemini"] Hugging Face Dataset Creator for FF-Terminal Overview This skill enables you to create, manage, and publish datasets on the Hugging Face Hub with a focus on agricultural and farm-related data. It provides tools for dataset initialization, configuration management, and efficient data uploads. When to Use This Skill Use this skill when you need to: Create new datasets for agricultural research or farm management Upload sensor data, crop yield data, weather measurements, or soil analysis results Structure farm operational data for machine learning projects Publish datasets for collaboration with the agricultural research community Manage existing datasets on the Hugging Face Hub Key Capabilities
api = HfApi() api.create_repo( repo_id= "your-username/farm-dataset-name" , repo_type= "dataset" , private= False , token=your_hf_token ) Step 2: Prepare Data Load and validate your agricultural data: import pandas as pd
df = pd.read_csv( "farm_sensor_data.csv" )
required_columns = [ "timestamp" , "sensor_type" , "value" , "location" ] assert all (col in df.columns for col in required_columns)
language: en tags:
This dataset contains sensor readings from agricultural IoT devices...
data/: Main data files in CSV formatmetadata/: Sensor configuration and location dataREADME.md: This file
"""
Step 4: Upload Dataset
Upload files to Hugging Face Hub:
from
huggingface_hub
import
upload_fileupload_file( path_or_fileobj= "farm_sensor_data.csv" , path_in_repo= "data/farm_sensor_data.csv" , repo_id= "your-username/farm-dataset-name" , repo_type= "dataset" , token=your_hf_token ) Best Practices for Agricultural Datasets Data Organization Use consistent column naming conventions Include proper units of measurement Add location metadata (GPS coordinates, field names) Document data collection methods and equipment used Include temporal information (growing season, planting dates) Metadata Standards Document data sources and collection methods Include information about sensor types and calibration Specify data frequency and resolution Add context about farming practices and crop varieties Provide data quality assessments and limitations Privacy and Security Remove or anonymize sensitive farm location data if needed Consider data licensing terms carefully Document any data restrictions or usage limitations Ensure compliance with agricultural data sharing agreements Integration with FF-Terminal Tools This skill works seamlessly with other FF-Terminal capabilities: Data Analysis : Use analyze_data tool to explore dataset patterns Visualization : Create charts and graphs for data insights Automation : Schedule regular data uploads and updates Monitoring : Set up alerts for data quality issues Troubleshooting Common Issues Authentication errors : Verify HF_TOKEN is set correctly Upload failures : Check file size limits and internet connection Format errors : Ensure data files match expected schema Permission issues : Confirm repository access rights Error Recovery Use api.repo_info() to check repository status Validate data locally before uploading Implement retry logic for network issues Keep backup copies of important datasets Example Use Cases
sensor_data = process_sensor_readings()
validated_data = validate_agricultural_data(sensor_data)
upload_to_huggingface(validated_data, "farm-iot-sensors" ) 2. Crop Yield Analysis Create datasets for yield prediction models:
yield_dataset = combine_farm_data( yield_data= "crop_yields.csv" , weather_data= "weather.csv" , soil_data= "soil_analysis.csv" )
create_ml_dataset(yield_dataset, "crop-yield-prediction" ) 3. Research Collaboration Share data with agricultural research community:
research_dataset = prepare_for_publication(raw_farm_data)
add_research_metadata(research_dataset, methodology_info)
publish_dataset(research_dataset, license= "CC-BY-SA" ) Advanced Features Version Control Track dataset versions with git-like functionality Maintain changelog of data updates and modifications Support for dataset branching and merging Automation Schedule regular data uploads and updates Automate data quality checks and validation Set up notifications for dataset changes Integration Connect with farm management software APIs Import data from agricultural equipment systems Export datasets in various formats for different tools Resources Hugging Face Datasets Documentation Agricultural Data Standards Farm Data Management Best Practices