NLP DataOps Engineer

Samsung R&D Institute Poland

  • Kraków, Lesser Poland
  • contract of employment
  • full-time
  • specialist (Mid / Regular)
  • hybrid work

Technologies we use

Expected

  • Python

  • DevOps (Linux, Bash, Git, Jenkins, Docker, OpenStack, nginx, Ansible)

  • Databases (PostgreSQL, InfluxDB)

  • Data Engineering & Data Science

  • Data Visualization and dashboarding tools

  • Natural Language Processing

About the project

We invite you to join one of the largest speech and language processing teams in Europe. We work closely with other R&D teams to develop and test our next-generation personal Intelligent Assistant. In our lab, engineers, researchers, and linguists work together on innovative products for the multilingual European market. We define the way users access, explore, and interact with devices, knowledge, information, and services. With us, you have a unique opportunity to work on a product that is available on a wide range of devices and used by millions of users.

Your responsibilities

  • Development and maintenance of data processing pipelines used for language analytics tasks.

  • Development and maintenance of dashboards and internal web services used to present, access, and annotate text, and to visualize usage data related to the Voice Assistant.

  • Management of Linux servers used for data acquisition and processing.

  • Automation of repetitive Natural Language Processing (NLP) tasks, such as retrieval of text data and management and annotation of text corpora.

  • Exploration of available text data to create meaningful reports (e.g. trend reports, usage pattern reports) and define metrics (e.g. end-to-end success rate) for other development teams.

  • Significant influence on the direction of work in the team, with the opportunity to participate in the creation of project proposals, research, and patent applications (especially in the field of data processing and analytics).

  • Significant impact on the technology stack: this is an R&D team, so we can choose our technologies more freely than regular development teams.

Our requirements

  • Bachelor's or master's degree in Computer Science, Mathematics, Telecommunications or related fields.

  • Proficiency in Python.

  • Practical knowledge of the Linux environment.

  • Experience with Git, GitHub, Jenkins, Grafana, Docker, or similar tools.

  • Knowledge of English at a level that allows for easy communication.

  • Creativity, an open mind, and the ability to apply knowledge in innovative ways are a plus.

Optional

  • Previous experience in a Natural Language Processing or text processing project. Ability to apply statistical methods and make inferences from data.

  • Experience with databases (especially PostgreSQL, InfluxDB).

  • Experience in any subdomain of Natural Language Processing (text classification, corpus linguistics, text analysis, sentiment analysis, information extraction).

  • Practical knowledge in Data Engineering and/or Data Science.

  • Experience in developing text- or voice-based human-computer interaction applications (chatbots, voice assistants, messenger bots, Alexa Skills, Google Assistant Actions, etc.).

  • Practical ability to use data visualization and dashboarding tools in Python.

What we offer

  • Friendly working atmosphere.

  • Wide range of training options (technical / soft skills / e-learning platform).

  • Opportunity to work on multiple projects.

  • Multidisciplinary and multicultural team.

  • Working with the latest technologies on the market.

  • Monthly integration budget.

  • Possibility to attend local and foreign conferences.

  • Opportunity to participate in scientific research (scientific papers, project proposals, patent applications, development of your own side projects).

Benefits

  • sharing the costs of sports activities

  • private medical care

  • sharing the costs of foreign language classes

  • life insurance

  • corporate products and services at discounted prices

  • integration events

  • dental care

  • no dress code

  • leisure zone

  • pre-paid cards

  • relocation package

  • baby layette

  • employee referral program

  • charity initiatives

  • unlimited free access to Copernicus Science Center

  • mentoring program

  • psychological support

  • possibility to test new Samsung products

  • work in Korea as a part of our Mobility Program

Technologies in use

  • Python

  • DevOps (Linux, Bash, Git, Jenkins, Docker, OpenStack, nginx, Ansible)

  • Data Engineering & Data Science (variety of libraries for training & test data collection, data augmentation, text corpus processing)

  • Databases (PostgreSQL, InfluxDB)

  • Data Visualization and dashboarding tools (Voila, Dash, Grafana, Flask, Jupyter, Python visualization stack)

  • Natural Language Processing (text classification, word & sentence embeddings, named entity recognition, information extraction, evaluation of machine learning models, sentiment analysis, deep learning methods)

Samsung R&D Institute Poland

If you share our faith in the power of technology to change reality, work with passion, are curious about the world, and still want to learn, this is the place for you. We know what working conditions to create to foster your development. We are looking for people who can turn bold visions of the future into projects and products that will serve millions of people around the world.
