The Digital, Data-Driven Demonstration Farm

Research Update: Cotton Data Management

At a Glance

ETDS Progress

  • Research Topic: Data Management in Cotton Production
  • PI: Glen Rains
  • Team: Peter Ngimbwa (PhD)
  • Main Objective: Determine the most suitable NoSQL database for handling heterogeneous data generated in cotton production.

Motivation

  • Emerging technologies such as IoT, drones, and rovers in cotton production generate vast and heterogeneous data that requires efficient management. 
  • Traditional databases struggle to handle heterogeneous datasets due to rigid schemas.
  • NoSQL databases offer flexibility and scalability to handle heterogeneous data.
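
To make the rigid-schema point concrete, here is a minimal sketch of how readings from different sources might look as schema-free documents. All field names and values are hypothetical placeholders, not taken from the project's data.

```python
import json

# Hypothetical records: field names and values are illustrative only.
records = [
    {"source": "iot", "soil_moisture": 0.23, "air_temp_c": 29.5},
    {"source": "drone", "image_path": "flight_01/img_0001.tif", "bands": ["red", "nir"]},
    {"source": "manual", "plant_height_cm": 84, "boll_count": 12, "node_count": 17},
]

# A document store can keep each record as-is, with its own set of fields.
documents = [json.dumps(r) for r in records]

# A single fixed table would need one column per distinct field,
# leaving most cells empty for any given row.
all_fields = sorted({key for r in records for key in r})
filled_cells = sum(len(r) for r in records)
empty_cells = len(all_fields) * len(records) - filled_cells

print(f"{len(all_fields)} columns needed, {empty_cells} empty cells in a fixed table")
```

Each record keeps only the fields its source produces, which is the flexibility the bullets above refer to.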

Proposed Solution

  • Use software performance testing techniques to evaluate the performance of NoSQL databases, specifically MongoDB (a document database) and Neo4j (a graph database).
  • The performance evaluation focused on response time, CPU utilization, and memory consumption.
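
As a rough illustration of the three metrics, the sketch below times an operation and records CPU time and peak memory using only the Python standard library. It is not the project's JMeter setup. The `measure` helper and the `insert_workload` stand-in are hypothetical, and the numbers describe the calling Python process rather than a database server.

```python
import statistics
import time
import tracemalloc


def measure(operation, repeats=5):
    """Run `operation` several times and summarize time, CPU, and memory."""
    wall, cpu, peak = [], [], []
    for _ in range(repeats):
        tracemalloc.start()
        wall_start, cpu_start = time.perf_counter(), time.process_time()
        operation()
        wall_end, cpu_end = time.perf_counter(), time.process_time()
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        wall.append(wall_end - wall_start)   # response time (wall clock)
        cpu.append(cpu_end - cpu_start)      # CPU time used by this process
        peak.append(peak_bytes)              # peak Python-level allocation
    return {
        "mean_response_s": statistics.mean(wall),
        "mean_cpu_s": statistics.mean(cpu),
        "max_peak_bytes": max(peak),
    }


def insert_workload():
    # Hypothetical stand-in for a batch of database inserts.
    store = {}
    for i in range(10_000):
        store[i] = {"plot": i % 20, "boll_count": i % 7}
    return store


result = measure(insert_workload)
print(result)
```

In a real comparison, `operation` would wrap a database client call, and server-side CPU and memory would need to be sampled separately.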

Results to Date

Graphic of database workflow. The graphic shows that "data sources" (IoT: soil & weather; rover: image & point cloud; drone: imagery data; manual: height, boll count, node count, etc.) are ingested into a box called "data loader (JMeter scripts)." The data is then inserted into both "Database (MongoDB)" and "Database (Neo4j)." The data goes through a "workload test" and into a single "performance testing (JMeter: CRUD, queries)" stage. The data then goes through "collect results" into "metric collection (CPU, RAM, response time)." The saved logs go into "result storage (CSV)" and are analyzed in "visualization & analysis."
Database workflow for cotton data
  • MongoDB significantly outperforms Neo4j in response time.
  • Neo4j tends to use more CPU resources than MongoDB; however, the difference is not significant.
  • MongoDB exhibits higher memory usage than Neo4j, but the difference is not significant.
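
The update does not say which statistical test underlies "significant" here. One common option for comparing two sets of measurements is a two-sided permutation test on the difference in means, sketched below. The sample values are placeholders chosen for illustration and are not results from the study.

```python
import random
import statistics


def permutation_test(a, b, n_permutations=5_000, seed=0):
    """Two-sided permutation test on the difference in means; returns a p-value."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign measurements to the two groups
        diff = abs(statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Placeholder measurements (arbitrary units), NOT data from the study.
separated_a = [10, 11, 12, 10, 11, 12]
separated_b = [20, 21, 22, 20, 21, 22]
overlapping_a = [10, 12, 11, 13, 12, 11]
overlapping_b = [11, 12, 10, 13, 11, 12]

p_separated = permutation_test(separated_a, separated_b)
p_overlapping = permutation_test(overlapping_a, overlapping_b)
print(p_separated, p_overlapping)
```

A small p-value for clearly separated samples and a large one for overlapping samples mirrors the pattern reported above: a significant gap in response time, but not in CPU or memory use.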

Next Steps

  • Increase the dataset size to strengthen the statistical significance of the analysis.
  • Conduct the analysis across different hardware platforms to ensure a more robust comparison.
  • Incorporate a vector database into the evaluation for a more comprehensive comparison.

Citation

Ngimbwa, P.C., Mwitta, C.J., Kiobia, D.O., Pengsheng, J., and Rains, G.C., 2025. A comparative analysis of document and graph databases for efficient data management in cotton production. 2025 Beltwide Cotton Conferences, New Orleans, LA.

