Data Modeling Strategies for Connected Vehicle Signal Data in MongoDB
-btok720i3f.png)
Today’s connected vehicles generate massive amounts of data. According to an article from S&P Global Mobility, a single modern car produces nearly 25GB of data per hour. To put that in perspective: that’s like each car pumping out the equivalent of six full-length Avatar movies in 4K—every single day! Now scale that across millions of vehicles, and it’s easy to see the challenge ahead. Of course, not all of that data needs to be synchronized to the cloud—but even a fraction of it puts significant pressure on the systems tasked with processing, storing, and analyzing it at scale. The challenge isn’t just about volume. The data is fast-moving and highly diverse—from telematics and location tracking to infotainment usage and driver behavior. Without a consistent structure, this data is hard to use across systems and organizations. That’s why organizations across the industry are working to standardize how vehicle data is defined and exchanged. One such example is the Connected Vehicle Systems Alliance or COVESA, which developed the Vehicle Signal Specification (VSS)—a widely adopted, open data model that helps normalize vehicle signals and improve interoperability. But once data is modeled, how do you ensure it's persistent and available at all times in real-time? To meet these demands, you need a data layer that's flexible, reliable, and performant at scale. This is precisely where a robust data solution designed for modern needs becomes essential. In this blog, we’ll explore data strategies for connected vehicle systems using VSS as a reference model, with a focus on real-world applications like fleet management. These strategies are particularly effective when implemented on flexible, high-performance databases like MongoDB, a solution trusted by leading automotive companies. Is your data layer ready for the connected car era? Relational databases were built in an era when saving storage space was the top priority. They work well when data fits neatly into tables and columns—but that’s rarely the case with modern, high-volume, and fast-moving vehicle data. Telematics, GPS coordinates, sensor signals, infotainment activity, diagnostic logs—data that’s complex, semi-structured, and constantly evolving. Trying to force it into a rigid schema quickly becomes a bottleneck. That’s why many in the automotive world are moving to document-oriented databases. A full-fledged data solution, designed for modern needs, can significantly simplify how one works with data, scale effortlessly as demands grow, and adapt quickly as systems evolve. A solution embodying these capabilities, like MongoDB, supports the demands of complex connected vehicle systems. Its features include: Reduced complexity: The document model mirrors the way developers already structure data in their code. This makes it a natural fit for vehicle data, where data often comes in nested, hierarchical formats. Scale by design: MongoDB’s distributed architecture and flexible schema design help simplify scaling. It reduces interdependencies, making it easier to shard workloads without performance headaches. Built for change: Vehicle platforms are constantly evolving, and MongoDB makes it easy to update data models without costly migrations or downtime, keeping development fast and agile. AI-ready: MongoDB supports a wide variety of data types—structured, time series, vector, graph—which are essential for AI-driven applications. This makes it the natural choice for AI workloads, simplifying data integration and accelerating the development of smart systems. Figure 1. The MongoDB connected car data platform. These capabilities are especially relevant in connected vehicle systems. Companies like Volvo Connect use MongoDB Atlas to track 65 million daily events from over a million vehicles, ensuring real-time visibility at massive scale. Another example is SHARE NOW, which handles 2TB of IoT data per day from 11,000 vehicles across 16 cities, using MongoDB to streamline operations and deliver better mobility experiences. It’s not just the data—it’s how you use it Data modeling is where good design turns into great performance. In traditional relational systems, modeling starts with entities and relationships to focus on minimizing data duplication. MongoDB flips that mindset. You still care about entity relationships—but what really drives design is how the data will be used. The core principle? Data that is accessed together should be stored together. Let’s bring this to life. Take a fleet management system. The workload includes vehicle tracking, diagnostics, and usage reporting. Modeling in MongoDB starts by understanding how that data is produced and consumed. Who’s reading it, when, and how often? What’s being written, and at what rate? Below, we show a simplified workload table that maps out entities, operations, and expected rates. Table 1. Fleet management workload example. Now, to the big question: how do you model connected vehicle signal data in MongoDB? It depends on the workload. If you're using COVESA’s VSS as your signal definition model, you already have a helpful structure. VSS defines signals as a hierarchy: attributes (rarely change, like tank size), sensors (update often, like speed), and actuators (reflect commands, like door lock requests). This classification is a great modeling hint. VSS’s tree structure maps neatly to MongoDB documents. You could store the whole tree in a single document, but in most cases, it’s more effective to use multiple documents per vehicle. This approach better reflects how the data is produced and consumed—leading to a model that’s better suited for performance at scale. Now, let’s look at two examples that show different strategies depending on the workload. Figure 2. Sample VSS tree. Source: Vehicle Signal Specification documentation. Example 1: Modeling for historical analysis For historical analysis—like tracking fuel consumption trends—time-stamped data needs to be stored efficiently. Access patterns may include queries like “What was the average fuel consumption per km in the last hour?” or “How did the fuel level change over time?” Here, separating static attributes from dynamic sensor signals helps minimize unnecessary updates. Grouping signals by component (e.g., powertrain, battery) allows updates to be scoped and efficient. MongoDB Time Series collections are built for exactly this kind of data, offering optimized storage, automatic bucketing, and fast time-based queries. Example 2: Modeling for the last vehicle state If your focus is real-time state—like retrieving the latest signal values for a vehicle—you’ll prioritize fast reads and lightweight updates. Common queries include “What’s the latest coolant temperature?” or “Where are all fleet vehicles right now?” In this case, storing a single document per vehicle or update group with only the most recent signal values works well. Updating fields in place avoids document growth and keeps read complexity low. Grouping frequently updated signals together and flattening nested structures ensures that performance stays consistent as data grows. These are just two examples—tailored for different workloads—but MongoDB offers the flexibility to adapt your model as needs evolve. For a deeper dive into MongoDB data modeling best practices, check out our MongoDB University course and explore our Building with Patterns blog series. The right model isn't one-size-fits-all—it’s the one that matches your workload. How to model your vehicle signal data At the COVESA AMM Spring 2025 event, the MongoDB Industry Solutions team presented a prototype to help simplify how connected vehicle systems adopt the Vehicle Signal Specification. The concept: make it easier to move from abstract signal definitions to practical, scalable database designs. The goal wasn’t to deliver a production-ready tool—it was to spark discussion, test ideas, and validate patterns. It resonated with the community, and we’re continuing to iterate on it. For now, the use cases are limited, but they highlight important design decisions: how to structure vehicle signals, how to tailor that structure to the needs of an application, and how to test those assumptions in MongoDB. Figure 3. Vehicle Signal Data Model prototype high-level architecture. This vehicle signals data modeler is a web-based prototype built with Next.js and powered by MongoDB Atlas. It’s made up of three core modules: Schema builder: This is where it starts. You can visually explore the vehicle signals tree, select relevant data points, and define how they should be structured in your schema. Use case mapper: Once the schema is defined, this module helps map how the signals are used. Which signals are read together? Which are written most often? These insights help identify optimization opportunities before the data even hits your database. Database exporter: Finally, based on what you’ve defined, the tool generates an initial database schema optimized for your workload. You can load it with sample data, export it to a live MongoDB instance, and run aggregation pipelines to validate the design. Together, these modules walk you through the journey—from signal selection to schema generation and performance testing—all within a simple, intuitive interface. Figure 4. Vehicle signal data modeler demo in action. Build smarter, adapt faster, and scale more confidently Connected vehicle systems aren’t just about collecting data—they’re about using it, fast and at scale. To get there, you need more than a standardized signal model. You need a data solution that can keep up with constant change, massive volume, and real-time demands. That’s where MongoDB stands out. Its flexible document model, scalable architecture, and built-in support for time series and AI workloads make it a natural fit for the complexities of connected mobility. Whether you're building fleet dashboards, predictive maintenance systems, or next-gen mobility services, MongoDB helps you turn vehicle data into real-world outcomes—faster. To learn more about MongoDB-connected mobility solutions, visit the MongoDB for Manufacturing & Mobility webpage. You can also explore the vehicle signals data modeller prototype and related resources on our GitHub repository.