Wake Vision Dataset Advances TinyML Development

The TensorFlow Blog, September 12

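Wake Vision is a new dataset designed for TinyML that addresses the field's lack of large, high-quality datasets. It contains roughly 6 million images, almost 100 times more than the earlier VWW dataset, and targets person detection specifically. Wake Vision provides two training sets, Wake Vision (Large) and Wake Vision (Quality), which prioritize dataset size and label quality respectively. The research shows that for TinyML models with few parameters, high-quality labels are more beneficial than a large volume of data. Wake Vision also provides fine-grained benchmarks that evaluate model performance in real-world scenarios, covering distance, lighting conditions, depictions of people, and perceived gender and age. With Wake Vision, model accuracy can improve by up to 6.6%, the error rate falls from 7.8% to 2.2%, and models show robustness across a variety of real-world conditions. The Wake Vision website also hosts a leaderboard where researchers can evaluate and compare model performance.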

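📊 Wake Vision is a large-scale dataset of roughly 6 million images, designed for person detection in TinyML. It is almost 100 times larger than the earlier VWW dataset and gives the field much-needed high-quality data.

🔍 The dataset provides two training sets: Wake Vision (Large) prioritizes dataset size, while Wake Vision (Quality) prioritizes label quality. This design lets researchers explore the trade-off between dataset size and quality.

📈 The research shows that for TinyML models with few parameters, high-quality labels matter more than sheer data volume. By providing high-quality labels, Wake Vision improves training results and generalization.

🌐 Beyond the data itself, Wake Vision provides fine-grained benchmarks that evaluate how models perform in real-world applications, covering detection distance, lighting conditions, different depictions of people, and perceived gender and age. These help surface model biases and limitations early.

🏆 Models trained on Wake Vision show marked gains: accuracy improves by up to 6.6% over the VWW dataset, the error rate falls from 7.8% to 2.2%, and the models remain robust across a variety of real-world conditions.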

Posted by Colby Banbury, Emil Njor, Andrea Mattia Garavagno, Vijay Janapa Reddi – Harvard University

TinyML is an exciting frontier in machine learning, enabling models to run on extremely low-power devices such as microcontrollers and edge devices. However, the growth of this field has been stifled by a lack of tailored large and high-quality datasets. That's where Wake Vision comes in—a new dataset designed to accelerate research and development in TinyML.

Why TinyML Needs Better Data

The development of TinyML requires compact and efficient models, often only a few hundred kilobytes in size. The applications targeted by standard machine learning datasets, like ImageNet, are not well-suited for these highly constrained models.

Existing datasets for TinyML, like Visual Wake Words (VWW), have laid the groundwork for progress in the field. However, their smaller size and inherent limitations pose challenges for training production-grade models. Wake Vision builds upon this foundation by providing a large, diverse, and high-quality dataset specifically tailored for person detection—the cornerstone vision task for TinyML.

What Makes Wake Vision Different?

Wake Vision is a new, large-scale dataset with roughly 6 million images, almost 100 times larger than VWW, the previous state-of-the-art dataset for person detection in TinyML. The dataset provides two distinct training sets:

Wake Vision (Large): prioritizes dataset size.

Wake Vision (Quality): prioritizes label quality.

Wake Vision's comprehensive filtering and labeling process significantly enhances the dataset's quality.

Why Data Quality Matters for TinyML Models

In traditional overparameterized models, it is widely believed that data quantity matters more than data quality, as an overparameterized model can adapt to errors in the training data. But according to the image below, TinyML tells a different story:

The figure above shows that high-quality labels (fewer errors) benefit under-parameterized models more than simply having more data. Larger, error-prone datasets can still be valuable when paired with fine-tuning techniques.

By providing two versions of the training set, Wake Vision enables researchers to explore the balance between dataset size and quality effectively.

Real-World Testing: Wake Vision's Fine-Grained Benchmarks

Unlike many open-source datasets, Wake Vision offers fine-grained benchmarks and detailed tests for real-world applications like those shown in the above figure. These enable the evaluation of model performance in real-world scenarios, such as:

Distance

Lighting Conditions

Depictions of people

Perceived Gender and Age

These benchmarks give researchers a nuanced understanding of model performance in specific, real-world contexts and help identify potential biases and limitations early in the design phase.
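
As a rough illustration of what slice-level evaluation involves, the sketch below groups predictions by a metadata tag and reports accuracy per group. The record layout and the slice names ("near", "far") are hypothetical and are not taken from the Wake Vision tooling.

```python
from collections import defaultdict


def accuracy_by_slice(records):
    """Return {slice_name: accuracy} for an iterable of
    (slice_name, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for slice_name, truth, prediction in records:
        total[slice_name] += 1
        if truth == prediction:
            correct[slice_name] += 1
    return {name: correct[name] / total[name] for name in total}


# Hypothetical predictions: 1 = person, 0 = no person.
records = [
    ("near", 1, 1),
    ("near", 0, 0),
    ("near", 1, 1),
    ("near", 1, 1),
    ("far", 1, 0),
    ("far", 1, 1),
    ("far", 0, 0),
    ("far", 1, 0),
]

per_slice = accuracy_by_slice(records)
overall = sum(t == p for _, t, p in records) / len(records)
print(per_slice)  # {'near': 1.0, 'far': 0.5}
print(overall)    # 0.75
```

In this made-up example, an overall accuracy of 0.75 hides the fact that the "far" slice sits at 0.5, which is the kind of gap a fine-grained benchmark is meant to surface.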

Key Performance Gains With Wake Vision

The performance gains achieved using Wake Vision are impressive:

Up to a 6.6% increase in accuracy over VWW

Error rate reduction from 7.8% to 2.2%

Robustness across various real-world conditions

Furthermore, combining the two Wake Vision training sets, using the larger set for pre-training and the quality set for fine-tuning, yields the best results, highlighting the value of both datasets when used in sophisticated training pipelines.
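
The pre-train-then-fine-tune recipe can be illustrated with a toy stand-in. The sketch below is not the authors' pipeline: it swaps in a pure-Python logistic regression on synthetic one-dimensional data, with a large set carrying 10% flipped labels playing the role of Wake Vision (Large) and a small clean set playing the role of Wake Vision (Quality). All names, sizes, noise levels, and learning rates are assumptions chosen for illustration.

```python
import math
import random


def sgd(weights, examples, lr, epochs):
    """Run logistic-regression SGD and return a new weight list."""
    w = list(weights)
    for _ in range(epochs):
        for features, label in examples:
            z = sum(wi * xi for wi, xi in zip(w, features))
            z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
            pred = 1.0 / (1.0 + math.exp(-z))
            w = [wi - lr * (pred - label) * xi for wi, xi in zip(w, features)]
    return w


def make_examples(rng, count, flip_prob):
    """Synthetic task: label is 1 when x > 0, flipped with probability flip_prob."""
    examples = []
    for _ in range(count):
        x = rng.uniform(-1.0, 1.0)
        label = 1 if x > 0 else 0
        if rng.random() < flip_prob:
            label = 1 - label
        examples.append(([x, 1.0], label))  # second feature acts as the bias term
    return examples


def accuracy(weights, examples):
    """Fraction of examples whose thresholded score matches the label."""
    hits = 0
    for features, label in examples:
        z = sum(wi * xi for wi, xi in zip(weights, features))
        hits += int((z > 0) == (label == 1))
    return hits / len(examples)


rng = random.Random(0)
large_noisy = make_examples(rng, 2000, flip_prob=0.10)  # stand-in for the large set
small_clean = make_examples(rng, 200, flip_prob=0.0)    # stand-in for the quality set
held_out = make_examples(rng, 500, flip_prob=0.0)

# Stage 1: pre-train on the large, noisier set.
pretrained = sgd([0.0, 0.0], large_noisy, lr=0.1, epochs=3)
# Stage 2: fine-tune on the cleaner set, starting from the pre-trained
# weights and using a smaller learning rate.
finetuned = sgd(pretrained, small_clean, lr=0.01, epochs=3)

print(accuracy(pretrained, held_out), accuracy(finetuned, held_out))
```

The design point is only the ordering: the second stage starts from the weights learned in the first and takes smaller steps, so the cleaner labels refine the model without discarding what the larger set contributed.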

Wake Vision Leaderboard: Track and Submit New Top-Performing Models

The Wake Vision website features a Leaderboard, providing a dedicated platform to assess and compare the performance of models trained on the Wake Vision dataset.

The leaderboard enables a clear and detailed view of how models perform under various conditions, with performance metrics like accuracy, error rates, and robustness across diverse real-world scenarios. It’s an excellent resource for both seasoned researchers and newcomers looking to improve and validate their approaches.

Explore the leaderboard to see the current rankings, learn from high-performing models, and submit your own to contribute to advancing the state of the art in TinyML person detection.

Making Wake Vision Easy to Access

Wake Vision is available through popular dataset services such as:

TensorFlow Datasets (TFDS)

Hugging Face Datasets

Edge AI Labs

With its permissive license (CC-BY 4.0), researchers and practitioners can freely use and adapt Wake Vision for their TinyML projects.
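
For readers who want to try the TensorFlow Datasets route, a minimal sketch might look like the following. The post does not give the catalog name or split names, so the defaults used here ("wake_vision" and "train_quality") are assumptions to verify against the TFDS catalog; `tfds.load` itself is the standard TFDS entry point.

```python
def load_wake_vision(split="train_quality", name="wake_vision"):
    """Return a tf.data.Dataset for one split via TensorFlow Datasets.

    Assumption: the default catalog name and split name are not stated in
    this post; confirm them in the TFDS catalog before relying on them.
    """
    # Imported lazily so this module loads even without the package installed
    # (requires `pip install tensorflow-datasets` to actually call).
    import tensorflow_datasets as tfds

    return tfds.load(name, split=split)
```

Calling `load_wake_vision()` downloads and prepares the data on first use, so with a dataset of this size it is worth checking available disk space beforehand.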

Get Started with Wake Vision Today!

The Wake Vision team has made the dataset, code, and benchmarks publicly available to accelerate TinyML research and enable the development of better, more reliable person detection models for ultra-low-power devices.

To learn more and access the dataset, visit Wake Vision’s website, where you can also check the leaderboard of top-performing models on the Wake Vision dataset and see whether you can build a better-performing one!
