(SENIOR) DATA SCIENTIST - NLP EXPERT
the tean is a Vienna-based technology company focused on developing AI-based digital platforms. We have been recognized as a Top Company by kununu for the fourth consecutive year, placing us among the top five percent of employers.
To strengthen our team, we are looking for a full-time (Senior) Data Scientist with a focus on Large Language Models (LLMs) and Amazon SageMaker. The position is based in Vienna; remote work is also possible.
TASKS
Machine Learning & AI Fundamentals
- Machine Learning Algorithms: Experience with supervised and unsupervised learning algorithms, including linear regression, decision trees, clustering, and dimensionality reduction
- Deep Learning: Proficiency in neural networks, RNNs, CNNs, transformers, and other architectures common in LLMs
Natural Language Processing (NLP) and Large Language Models (LLMs)
- NLP Techniques: Understanding of tokenization, stemming, lemmatization, stop-word removal, and vectorization
- LLM Training and Fine-Tuning: Knowledge of transformer architectures (e.g., GPT, BERT) and experience in training/fine-tuning large models on custom datasets
- Prompt Engineering: Ability to design and experiment with effective prompts for task-specific language model responses
- Evaluation and Interpretation: Familiarity with BLEU, ROUGE, and other LLM evaluation metrics
Programming Skills
- Python: Proficiency in Python, especially libraries like Pandas, NumPy, and Scikit-learn for data manipulation and model building
- Deep Learning Frameworks: Hands-on experience with TensorFlow, PyTorch, and Hugging Face Transformers for building and fine-tuning language models
- SQL: Competence in SQL for data retrieval and manipulation
- Bash and Scripting: Ability to write scripts for automation and for handling cloud-related tasks
Data Engineering and Preprocessing
- ETL (Extract-Transform-Load) Skills: Experience with ETL processes to gather, clean, and preprocess data before model training
- Feature Engineering: Ability to extract and design features tailored to NLP tasks
MLOps and Model Lifecycle Management
- CI/CD for ML Models: Experience in setting up continuous integration/continuous deployment pipelines for machine learning workflows
- Version Control: Knowledge of Git and version control workflows
Amazon SageMaker Expertise (nice to have)
- Model Deployment: Knowledge of deploying machine learning models on SageMaker, including containerization
- Model Monitoring: Understanding of SageMaker model monitoring for data drift, accuracy, and bias
- Integration with AWS Services: Experience with related AWS services (S3, Lambda, CodeCommit)
PROFILE
- Completed university degree in a relevant field (e.g., computer science or data science)
- Experience in the areas listed above
- Structured work style with enjoyment in solving complex tasks
- Strong interpersonal skills and intercultural understanding
- Interest in working in interdisciplinary development teams
- Excellent knowledge of German and English
BENEFITS
- Flexible work locations (remote work)
- Flexible working hours
- Good work-life balance
- Friendly, ambitious team
- Regular team events (recently in Barcelona and Athens)
- Flat hierarchies and diverse development opportunities
- Room for personal development and your own ideas
- Support for further training
- International network of experts
- Innovative projects related to AI and digital business
- Exciting and challenging tasks
- Great headquarters on Vienna's Stephansplatz
- Modern work equipment
- Commitment to animal welfare
The collective agreement for information and consulting applies to the advertised position. The minimum monthly salary is EUR 3,100 gross (full-time). Depending on qualifications and professional experience, we are prepared to pay above this minimum.
Are you interested? If so, please send any questions and/or your application (cover letter, CV with photo, relevant certificates, etc.) exclusively by e-mail to jobs@the-tean.com.
When applying, please indicate the platform through which you came across our job advertisement.
You will find information on the processing of your data in our privacy policy.