From 7 July 2026, every new car, van, truck, and bus registered in the European Union must ship with a camera watching the driver. The same regulation that requires that camera also makes the data needed to train the model behind it difficult to collect lawfully. That contradiction is the real engineering problem behind driver monitoring, and it is why synthetic data has moved from a convenience to a requirement.
This post explains the mandate, the privacy mechanics that create the data gap, the latency budget these systems run under, and how a synthetic driver-behavior dataset is built to close the gap. The dataset in question covers 4 distraction behaviors across varied cabin viewpoints, and it exists precisely because the real version of it is hard to assemble without running into the law.
What the Mandate Actually Requires
The relevant law is the EU General Safety Regulation, Regulation (EU) 2019/2144, often called GSR2. It phases in two related driver-monitoring systems. Driver Drowsiness and Attention Warning (DDAW) detects fatigue from driving and steering patterns. Advanced Driver Distraction Warning (ADDW) goes further, using an in-cabin camera to detect when the driver's attention has left the road.
The timeline is staged. ADDW became mandatory for new vehicle types from 7 July 2024, and becomes mandatory for all newly registered vehicles from 7 July 2026. After that date, a manufacturer cannot register a new vehicle in the EU without a working driver-monitoring system on board. This is not a Euro NCAP star-rating incentive that manufacturers can opt out of by accepting a lower score. It is a homologation requirement: no compliant system, no registration.
| Requirement milestone | Date | Scope |
|---|---|---|
| ADDW mandatory for new vehicle types | 7 July 2024 | New type approvals |
| ADDW mandatory for newly registered vehicles | 7 July 2026 | All new EU registrations |
The detection target is specific. These systems must recognize that the driver is no longer attending to the road, in real time, reliably, across the full range of drivers and conditions a vehicle will encounter. A system that works for some faces, some lighting, and some seating positions but not others does not pass. That breadth requirement is where the data problem begins.
The Privacy Architecture That Creates the Data Gap
The reason ADDW is legally defensible at all is that it processes everything inside the car. A near-infrared camera watches the driver continuously, an embedded processor extracts head pose and gaze, and the only thing that leaves that processing step is a binary signal: attending to the road, or not. The raw video stream does not leave the vehicle. That edge-processing design is what lets manufacturers argue the system is a safety device rather than a surveillance device.
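As a rough illustration of that edge-processing step, the sketch below reduces a per-frame head pose to a single attending/not-attending flag. The `HeadPose` type, the angle thresholds, and the five-frame window are assumptions made for this example; they are not taken from the regulation or from any production system.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class HeadPose:
    yaw_deg: float    # left/right rotation, 0 = facing the road
    pitch_deg: float  # up/down rotation, 0 = facing the road


# Hypothetical thresholds, chosen only for illustration.
MAX_YAW_DEG = 30.0
MAX_PITCH_DEG = 20.0


def is_attending(pose: HeadPose) -> bool:
    """Classify a single frame from head pose alone."""
    return abs(pose.yaw_deg) <= MAX_YAW_DEG and abs(pose.pitch_deg) <= MAX_PITCH_DEG


class AttentionSignal:
    """Keeps only a short window of booleans; no image data is stored."""

    def __init__(self, window: int = 5):
        self._recent = deque(maxlen=window)

    def update(self, pose: HeadPose) -> bool:
        """Return True (attending) unless a full window of frames is off-road."""
        self._recent.append(is_attending(pose))
        window_full = len(self._recent) == self._recent.maxlen
        return not (window_full and not any(self._recent))
```

The only state retained between frames is a handful of booleans, which mirrors the point that nothing but the binary signal survives the processing step.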
Here is the trap. The same privacy logic that makes deployment lawful makes training data scarce. Under GDPR, an image of a person's face is personal data, and in-cabin footage is about as sensitive as personal data gets. Collecting, storing, and reusing real driver video at the scale and demographic diversity a safety-critical model needs runs directly into biometric privacy constraints. Teams that do manage to collect real footage tend to end up with narrow data: a handful of consenting participants, a limited set of vehicles, a constrained range of lighting and behavior.
So the regulation simultaneously demands a model that generalizes across all drivers and conditions, and restricts the real-world data collection that would let you train such a model. This is not a gap that more careful data collection closes. It is structural. The lawful way to get broad, balanced, demographically varied in-cabin training data without harvesting real people's faces is to generate it.
That is the precise role synthetic data plays here. The value is not that synthetic frames look real. The value is that they sidestep the privacy constraint entirely while giving you control over coverage: behaviors, viewpoints, lighting, and driver variation, generated to a chosen distribution rather than scavenged from whatever footage you could lawfully obtain.
What the Dataset Covers
The synthetic driver monitoring dataset contains 1,356 annotated images, with bounding box annotations in normalized YOLO format. Every image is synthetically generated, so none of it carries the privacy load of real in-cabin footage.
| Coverage axis | Value |
|---|---|
| Images | 1,356 |
| Distraction classes | 4 |
| Resolution | Mixed |
| Annotation format | YOLO bounding boxes |
| Real driver footage used | 0% |
It targets 4 distraction behaviors as detection classes: drinking, yawning, calling, and texting. These are the concrete, observable actions that an ADDW-style system needs to distinguish from a driver who is simply attending to the road. Yawning is a fatigue signal that overlaps with the DDAW drowsiness target; calling and texting are the classic phone-distraction behaviors that dominate distracted-driving crash statistics; drinking is a hands-and-gaze-off-task behavior that is easy to describe and hard to collect at volume in the real world.
| Class | Catalog object count | Safety signal |
|---|---|---|
| drinking | 478 | Hands-and-gaze-off-task behavior |
| yawning | 219 | Fatigue/drowsiness proxy |
| calling | 250 | Phone-distraction behavior |
| texting | 409 | Phone-distraction behavior |
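The per-class counts above can be turned into class shares and simple inverse-frequency weights, as in the sketch below. The counts come from the catalog table. The weighting formula (total divided by number of classes times class count) is one common convention chosen for this example, not something the dataset prescribes.

```python
# Object counts per class, as listed in the catalog table.
CLASS_COUNTS = {"drinking": 478, "yawning": 219, "calling": 250, "texting": 409}


def class_shares(counts: dict) -> dict:
    """Fraction of all annotated objects that belongs to each class."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def balanced_weights(counts: dict) -> dict:
    """Inverse-frequency weights: total / (num_classes * class_count)."""
    total = sum(counts.values())
    k = len(counts)
    return {name: total / (k * n) for name, n in counts.items()}


if __name__ == "__main__":
    shares = class_shares(CLASS_COUNTS)
    weights = balanced_weights(CLASS_COUNTS)
    for name in CLASS_COUNTS:
        print(f"{name:9s} share={shares[name]:.3f} weight={weights[name]:.3f}")
```

The four counts sum to 1,356. Yawning makes up roughly 16% of objects and drinking roughly 35%, so a team that wants each behavior to carry equal weight may need to reweight or resample.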
The most distinctive property of the dataset is viewpoint coverage. The images span multiple cabin camera views. This matters more than it might first appear. There is no single standardized mounting position for a driver-monitoring camera. Manufacturers place them on the steering column, the dashboard, the A-pillar, the rear-view mirror housing, and elsewhere, and each placement produces a different geometry of the driver's face and hands. A model trained on one viewpoint degrades on another. Building viewpoint variation into the dataset directly attacks the generalization requirement the regulation imposes.
The catalog records the dataset resolution as Mixed, reflecting varied cabin-facing frames around the driver's upper body and the area where hands, phone, and face interact. The annotations localize the behavior within that frame rather than merely classifying the image, which is what an object detection pipeline driving a real-time alert needs.
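Because resolution is mixed, normalized YOLO labels have to be scaled by each image's own width and height. The sketch below parses one label line (`class x_center y_center width height`, all normalized to 0..1) into pixel corners. The class index order in `CLASS_NAMES` is an assumption; the real mapping should be read from the dataset's own configuration file.

```python
from typing import NamedTuple

# Assumed index order for illustration; check the dataset's class mapping.
CLASS_NAMES = ["drinking", "yawning", "calling", "texting"]


class Box(NamedTuple):
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_line(line: str, img_w: int, img_h: int) -> Box:
    """Convert one normalized YOLO label line to pixel-space corners."""
    cls, xc, yc, w, h = line.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    return Box(
        label=CLASS_NAMES[int(cls)],
        x_min=(xc - w / 2) * img_w,
        y_min=(yc - h / 2) * img_h,
        x_max=(xc + w / 2) * img_w,
        y_max=(yc + h / 2) * img_h,
    )
```

For example, under the assumed class order, `parse_yolo_line("3 0.5 0.5 0.25 0.5", 640, 480)` yields a `texting` box spanning x from 240 to 400 and y from 120 to 360.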
Why Viewpoint and Behavior Coverage Decide Reliability
The failure mode for these systems is not the average case. It is the edge. Euro NCAP, which sets the performance bar above the legal minimum, explicitly evaluates in-cabin monitoring across a wide matrix of conditions rather than best-case scenarios: varied lighting, partial occlusion, unusual seating postures, and diverse drivers. A system that scores well on a clean frontal view in good light and fails on an angled view of a driver wearing a cap is not a system that passes a serious evaluation.
This is where balanced synthetic coverage earns its place. A real dataset assembled from limited consenting participants will be skewed: toward whoever volunteered, toward whatever camera the collection rig used, toward the behaviors that happened to occur during recording. The rare-but-critical combinations, an unusual angle, an atypical posture, a less-represented demographic, will be thin or absent. Those are exactly the cases the model will face in production and exactly the cases that determine whether it passes evaluation.
Generating the data inverts that skew. Each behavior can be rendered across viewpoint and lighting variation, so the coverage floor is even rather than dictated by who showed up to a recording session. The 4-behavior structure of this dataset is a coverage grid, not a random sample.
| Coverage dimension | Count | Why it matters |
|---|---|---|
| Behaviors | 4 | Separates fatigue and phone-distraction signals |
| Images | 1,356 | Gives the detector repeated examples across cabin conditions |
| Classes | 4 | Keeps the detection target specific enough for alerting |
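One way to make the coverage-grid idea concrete is to enumerate every combination of behavior, viewpoint, and lighting, then check which cells a dataset leaves empty. In the sketch below the behaviors are the dataset's four classes and the viewpoints are the mounting positions named earlier in this post. The lighting levels are hypothetical, and nothing here claims to describe how the dataset was actually generated.

```python
from itertools import product

BEHAVIORS = ["drinking", "yawning", "calling", "texting"]
VIEWPOINTS = ["steering_column", "dashboard", "a_pillar", "mirror_housing"]
LIGHTING = ["day", "dusk", "night_nir"]  # assumed levels, for illustration


def coverage_grid():
    """Every (behavior, viewpoint, lighting) cell the generator should fill."""
    return list(product(BEHAVIORS, VIEWPOINTS, LIGHTING))


def missing_cells(observed):
    """Grid cells that have no sample in `observed`, in sorted order."""
    return sorted(set(coverage_grid()) - set(observed))


if __name__ == "__main__":
    grid = coverage_grid()
    print(len(grid), "cells in the grid")
    print("missing:", missing_cells(grid[:-1]))
```

A real recording session fills whichever cells happen to occur. A generator can iterate over the grid directly and guarantee a minimum count per cell.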
The Latency Budget Nobody Mentions
There is a constraint on these systems that the regulatory discussion tends to skip: they have to run fast, on cheap hardware, forever.
A driver-monitoring model does not run in a data center. It runs on an embedded automotive processor, inside the car, every frame, for the life of the vehicle. The alert is only useful if it fires while the driver is still distracted, which means the full loop of capture, inference, and warning has to complete within a fraction of a second. A detector that needs a heavy GPU to hit its accuracy numbers is not a deployable detector, regardless of how good those numbers look on a benchmark.
This is why the dataset is annotated in YOLO format. The YOLO family of detectors is the standard choice for real-time, single-pass object detection precisely because it is fast enough to run on constrained hardware while remaining accurate enough to be useful. Shipping training data in the format the deployment architecture expects is a small thing that signals the dataset was built with the real-time, on-device constraint in mind rather than as an academic exercise.
The practical consequence for the data is that coverage quality matters more than raw volume. You cannot fix a fast, small model's blind spots by making it bigger; the hardware budget will not allow it. You fix them by making sure the training data covers the cases the model will actually see. Even coverage across behaviors and viewpoints is how you get a model that is both small enough to run on the vehicle's processor and reliable enough to pass.
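The latency constraint reduces to simple arithmetic: at a given frame rate, capture, inference, and alerting together must fit within one frame period if every frame is to be processed. The sketch below checks a set of stage timings against that budget. The frame rate and the per-stage timings in the usage example are hypothetical placeholders, not measurements from any real system.

```python
from typing import Dict, Tuple


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at the given frame rate."""
    return 1000.0 / fps


def fits_budget(stage_ms: Dict[str, float], fps: float) -> Tuple[bool, float]:
    """Return (fits, headroom_ms) for the summed pipeline stages."""
    budget = frame_budget_ms(fps)
    total = sum(stage_ms.values())
    return total <= budget, budget - total


if __name__ == "__main__":
    # Hypothetical per-stage timings on an embedded target.
    stages = {"capture": 5.0, "inference": 25.0, "alert": 2.0}
    ok, headroom = fits_budget(stages, fps=25)
    print(f"fits={ok}, headroom={headroom:.1f} ms")
```

With these placeholder numbers the pipeline uses 32 ms of a 40 ms budget at 25 frames per second. Moving to a larger model eats into that headroom first, which is why coverage is the cheaper lever.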
| Evaluation metric | Catalog value |
|---|---|
| Model checkpoint | yolov8m.pt |
| mAP@0.5 | 0.8306 |
| mAP@0.5:0.95 | 0.4409 |
| Precision | 0.8176 |
| Recall | 0.7590 |
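Precision and recall can be combined into a single F1 score using the standard harmonic mean. The sketch below applies it to the catalog values. The resulting figure is derived here for convenience and is not a number the catalog itself reports.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as listed in the catalog table.
    print(f"F1 = {f1_score(0.8176, 0.7590):.4f}")
```

Recall is the lower of the two inputs. For an alerting system that means missed distraction events are the larger share of the error, so that is where additional coverage would be expected to help.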
Where Synthetic Fits
The case for synthetic driver-monitoring data is narrower and more durable than a blanket claim that synthetic replaces real. Two things are true at once.
First, synthetic data is the only practical way to get broad, balanced, privacy-clean coverage for a system the law now requires. The regulation demands generalization across drivers, behaviors, and viewpoints; GDPR restricts the real collection that would provide it; synthetic generation resolves the contradiction. That is not a marketing claim, it is the structure of the problem.
Second, synthetic data is not a complete substitute for validation against reality. A driver-monitoring model headed for homologation and Euro NCAP testing should be validated against real held-out data, and most teams will benefit from mixing some real footage into training to close the residual domain gap. Synthetic builds the coverage floor and removes the privacy blocker on the bulk of the data. A measured amount of real data, collected with proper consent, anchors the model to the specific cameras and cabins it will ship in. Treating synthetic as the entire solution overstates what it can do alone.
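A minimal sketch of that division of labor: hold out part of the consented real footage purely for validation, and mix the remainder into the synthetic training set. The function name, the 50% holdout fraction, and the file names in the usage example are hypothetical; the right ratio depends on how much real data a team can lawfully collect.

```python
import random
from typing import List, Sequence, Tuple


def split_real_and_mix(
    synthetic: Sequence[str],
    real: Sequence[str],
    real_holdout_frac: float = 0.5,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Return (train, real_holdout); held-out real items never enter train."""
    rng = random.Random(seed)
    shuffled = list(real)
    rng.shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * real_holdout_frac))
    holdout = shuffled[:n_holdout]
    train = list(synthetic) + shuffled[n_holdout:]
    rng.shuffle(train)
    return train, holdout


if __name__ == "__main__":
    syn = [f"syn_{i}.jpg" for i in range(10)]   # placeholder file names
    real = [f"real_{i}.jpg" for i in range(4)]  # placeholder file names
    train, holdout = split_real_and_mix(syn, real)
    print(len(train), "training items,", len(holdout), "real held-out items")
```

Keeping the holdout strictly real means the validation score measures performance on the target domain rather than on more synthetic frames.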
The honest framing is that synthetic data makes a lawful, well-covered driver-monitoring dataset possible at all, and real validation makes it trustworthy. For a system that becomes mandatory across an entire continent in July 2026, both halves matter.
The dataset is open and YOLO-formatted. It is a useful starting point for any team building toward the GSR2 deadline that has run into the wall between what the regulation requires and what the privacy rules allow them to collect.
The Synthetic Driver Monitoring Detection dataset is available on Hugging Face and the AnywayLabs dataset catalog. EU regulatory details reference Regulation (EU) 2019/2144 and its ADDW provisions.



