InData Labs is a data science firm and AI-powered solutions provider with its own R&D center. Our main focus is machine learning and deep learning solutions, as well as building high-load data processing systems.
Currently, we are looking for a Data Scientist / Machine Learning Engineer who will join the general-purpose data science team and work on tasks covering a wide variety of business needs, without a focus on computer vision.
In this position, you will work with multiple data sources (usually numerical, textual and time-related data, less frequently visual data) and with datasets both huge and small to develop, validate and deploy machine learning models, tune their performance and integrate them into data processing pipelines.
Deal with both structured and unstructured data, collaborate with data engineers on defining data storage formats, and state data collection requirements;
Not only solve technical tasks but also understand business needs, offer appropriate solutions along with data collection and labelling requirements and recommendations, and explain the chosen approach to non-technical people;
Set up reproducible experiments: selection, training, validation and optimization of machine learning models, evaluation of their quality in business-related terms;
Integrate data preprocessing and model inference into general data processing pipelines;
Research new tools, papers, etc. in the machine learning area.
Strong knowledge and deep understanding of:
Classical machine learning (linear models, decision trees, ensembles for classification and regression tasks, clustering and dimensionality reduction)
Main concepts and stages of the modelling process (validation scheme, regularization, overfitting and generalization, data leaks, feature selection, etc.)
Hands-on experience with Python scientific and ML-related libraries
Hands-on experience with gradient boosting libraries (xgboost, lightgbm or catboost)
Experience with relational databases and SQL
Ability to implement space- and time-efficient algorithms and to understand which one is preferable and when
Good Python programming skills
Data visualization and presentation skills
Good spoken and written English (at least B1)
Ability and desire to convert raw business requests into strictly formulated machine learning tasks
Ability to formulate data gathering (or data labelling) requirements
At least 1 year of experience in machine learning
Would be a plus:
Hands-on experience with developing parallel code in Python
Familiarity with non-relational databases (Cassandra, Elasticsearch, MongoDB, etc.)
Experience in software engineering, deployment and integration with data delivery systems and other components, building microservices, and providing APIs for model access
Experience in developing recommender systems and in time series analysis
Experience in Natural Language Processing
Experience in Deep Learning with applications to any data domain
Experience in setting up data labelling processes using third-party or self-made labelling tools
Participation in ML competitions (Kaggle, etc.)
Master's, PhD, or equivalent experience in Mathematics or Computer Science.
You will work with smart people who love to solve hard problems, and who not only expect but also foster high performance!