kjamistan: Data Validation & Testing

Interested in testing your data science code or pipeline? Want to verify your models against malicious attacks or noise? We can help!

We've developed several libraries for adding noise, duplication and distortion to datasets. These tools, alongside data testing best practices (including standard unit testing, property-based testing and integration testing), can help harden your code and ensure your project is ready for release.
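As a rough illustration of the idea, the sketch below injects noise and duplicate rows into small synthetic datasets and checks that two pipeline steps still behave as expected. It uses only the Python standard library and a hand-rolled random loop rather than a dedicated property-based testing framework. Every function name here (add_gaussian_noise, inject_duplicates, min_max_scale, deduplicate, check_pipeline_properties) is hypothetical and chosen for this example; none of it is taken from our libraries.

```python
import random


def add_gaussian_noise(values, sigma, rng):
    """Return a copy of values with zero-mean Gaussian noise added to each entry."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def inject_duplicates(rows, fraction, rng):
    """Return rows plus randomly chosen repeated rows, in shuffled order."""
    n_extra = int(len(rows) * fraction)
    dirty = list(rows) + [rng.choice(rows) for _ in range(n_extra)]
    rng.shuffle(dirty)
    return dirty


def min_max_scale(values):
    """Hypothetical pipeline step under test: scale values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def deduplicate(rows):
    """Hypothetical pipeline step under test: drop repeats, keep first-seen order."""
    return list(dict.fromkeys(rows))


def check_pipeline_properties(trials=200, seed=0):
    """Run both steps on many randomly corrupted inputs and assert simple properties."""
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    for _ in range(trials):
        # Property 1: scaling noisy input keeps the length and stays inside [0, 1].
        values = [rng.uniform(-1e3, 1e3) for _ in range(rng.randint(1, 50))]
        noisy = add_gaussian_noise(values, sigma=rng.uniform(0.0, 10.0), rng=rng)
        scaled = min_max_scale(noisy)
        assert len(scaled) == len(noisy)
        assert all(0.0 <= s <= 1.0 for s in scaled)

        # Property 2: deduplication recovers exactly the original unique rows.
        rows = list(dict.fromkeys(
            (rng.randint(0, 20), rng.randint(0, 20))
            for _ in range(rng.randint(1, 30))
        ))
        dirty = inject_duplicates(rows, fraction=0.5, rng=rng)
        assert sorted(deduplicate(dirty)) == sorted(rows)
    return True
```

Calling check_pipeline_properties() returns True when every trial passes and raises an AssertionError otherwise. In a real project the same properties would more likely be expressed with a property-based testing framework, which also shrinks failing inputs to a minimal case.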

Why Test?

Data validation and testing are essential, especially as enterprises release more data science code and models to the public internet or directly to customers. Data teams often skip standard software engineering best practices, leaving your product, customers and data open to attack.

Testing your code gives data scientists, developers and product owners peace of mind. When testing is a normal part of development and release, there is a smaller chance of introducing a bug, or even a larger issue that can interrupt normal business or deter customers. We help you implement tests so you can have happy employees and worry-free launches.

Our Testing & Validation Services

We offer several data validation and testing services that might benefit your team, including:

  • Test Framework and Methodology Design Document and Advising
  • Fuzz Testing Models & Data Science APIs
  • Developing Data Science Unit Tests
  • Consulting on Best Implementations for Data Validation
  • Stress and Fuzz Testing Your Data Science Pipeline or Machine Learning Models
  • Outlining a Data Validation Plan

We also offer training and workshops on data testing best practices for your team. If you have a request that you aren't sure fits, please reach out; we are happy to discuss options.