kjamistan: Data Security, Validation & Testing

Interested in securing your data science code or pipeline? Want to verify your models against malicious attacks or noise? Want to see how much information is leaked into your model and how to make your data processing pipeline more secure? We can help!

We've developed several libraries for adding noise, duplication and distortion to datasets. These tools, paired with data testing best practices (including standard unit testing, property-based testing and integration testing), can help harden your code and ensure your project is ready for release.
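To make the idea concrete, here is a minimal sketch of injecting noise and duplication into a dataset and checking that a pipeline step survives it. It does not use our libraries: the helper names (`add_gaussian_noise`, `duplicate_rows`, `deduplicate`) and the toy records are assumptions made up for illustration, and only the Python standard library is used.

```python
import random

def add_gaussian_noise(rows, column, sigma, seed=0):
    """Return a copy of rows with Gaussian noise added to one numeric column."""
    rng = random.Random(seed)
    return [{**row, column: row[column] + rng.gauss(0.0, sigma)} for row in rows]

def duplicate_rows(rows, fraction, seed=0):
    """Return rows followed by randomly chosen duplicates (fraction of the original size)."""
    rng = random.Random(seed)
    n_dupes = int(len(rows) * fraction)
    return list(rows) + [dict(rng.choice(rows)) for _ in range(n_dupes)]

def deduplicate(rows):
    """Hypothetical pipeline step under test: drop exact duplicate records."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Illustrative data only
clean = [{"id": i, "amount": 10.0 * i} for i in range(20)]
noisy = add_gaussian_noise(clean, "amount", sigma=0.5)
dirty = duplicate_rows(clean, fraction=0.25)

# Property we expect to hold: deduplicating the dirtied data recovers the clean data
assert deduplicate(dirty) == clean
```

The same property ("cleaning perturbed data recovers the original") could be handed to a property-based testing tool so that it is checked against many generated datasets instead of one hand-written example.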

Why Test?

Testing machine learning models against adversarial inputs, feature and model extraction, and privacy attacks is a useful way to gauge their security. The results also inform the design of the API that others will use to access the model. We offer black-box model testing, similar to penetration testing for software.
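As a rough illustration of the black-box setting, the sketch below probes a model only through its prediction function and measures how often small random perturbations change the predicted label. Everything here is an assumption for illustration: `flip_rate` and `toy_model` are hypothetical names, and random perturbation is a much weaker probe than a crafted adversarial attack.

```python
import random

def flip_rate(predict, inputs, epsilon, trials=20, seed=0):
    """Fraction of inputs whose predicted label changes under small random perturbations.

    predict is treated as a black box: only its outputs are observed.
    """
    rng = random.Random(seed)
    flipped = 0
    for x in inputs:
        base = predict(x)
        for _ in range(trials):
            perturbed = [v + rng.uniform(-epsilon, epsilon) for v in x]
            if predict(perturbed) != base:
                flipped += 1
                break  # one changed label is enough to count this input
    return flipped / len(inputs)

def toy_model(x):
    """Stand-in model: a linear threshold classifier."""
    return int(x[0] + x[1] > 1.0)

# Two points far from the decision boundary, two very close to it
points = [[0.1, 0.1], [0.5, 0.49], [0.9, 0.9], [0.52, 0.5]]
rate = flip_rate(toy_model, points, epsilon=0.05)
print(f"label flip rate: {rate:.2f}")
```

Inputs close to the decision boundary are the ones that can flip, which is why a probe like this can highlight where a model is fragile without any access to its internals.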

Data validation and testing are essential for data processing pipelines, especially as enterprises release more data science code and models to the public internet or to their customers. Many data teams have not yet adopted standard software engineering practices, which can leave your product, data and customers open to attack.
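A minimal sketch of what record-level validation at a pipeline boundary might look like follows. The field names (`user_id`, `age`, `email`) and the rules are assumptions chosen for illustration; a real pipeline would derive them from its own schema.

```python
def validate_record(record):
    """Return a list of human-readable problems; an empty list means the record passed."""
    problems = []
    if not isinstance(record.get("user_id"), int):
        problems.append("user_id must be an integer")
    age = record.get("age")
    if isinstance(age, bool) or not isinstance(age, (int, float)) or not 0 <= age <= 130:
        problems.append("age must be a number between 0 and 130")
    email = record.get("email")
    if not isinstance(email, str) or "@" not in email:
        problems.append("email must be a string containing '@'")
    return problems

# Illustrative records only
good = {"user_id": 1, "age": 30, "email": "ada@example.com"}
bad = {"user_id": "1", "age": -5, "email": "not-an-email"}

assert validate_record(good) == []
print(validate_record(bad))
```

Returning a list of problems, instead of raising on the first failure, lets a pipeline log every issue with a record before deciding whether to reject or quarantine it.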

Testing your code provides peace of mind for data scientists, developers and product owners. When testing is a normal part of the development and release process, you are less likely to introduce a bug, or worse, and therefore less likely to face an interruption to your normal business. We help you implement tests so you can have happy employees and worry-free launches.

Our Testing and Validation Services

We offer several data validation and testing services for your team, including:

  • Pen-testing for Machine Learning Models
  • Privacy testing via de-anonymization and re-identification attacks
  • Security Consulting on active Machine Learning projects
  • Fuzz testing Models & Data Science APIs
  • Developing Data Science Unit Tests
  • Consulting on best implementations for Data Validation
  • Stress and fuzz-testing your Data Science Pipeline or Machine Learning Models
  • Outlining a Data Testing and Validation Plan

We also offer training and workshops for your team on data testing best practices. If you have another request, please feel free to reach out. We are happy to discuss other options that draw on our expertise and experience.