About
Surprisingly, I studied Physics at university. Because of the lack of computing infrastructure, I set up a mini HPC cluster (5 nodes) for academic staff at Ferdowsi University of Mashhad. Users were working on applications of ML/DL in physics: object classification in astronomy and track reconstruction in HEP. Soon after, the modules needed for simulations in astronomy, chemistry, and condensed matter were built.
After that, I did R&D at GreenWeb and MLOps at Ferdowsi-Cloud, designing and building Big-Data and ML microservices.
Currently, I work remotely at payever, implementing Kubernetes infrastructure for the smooth operation of cloud-based microservices. Meanwhile, as a part-time Linux System Engineer, I am working on a Software-Defined Storage (SDS) solution at Part Software Group. We aim to develop an open-source SDS to be deployed on our VMware infrastructure and on the KVM hypervisor.
Experience
2022-current ↴
I have responsibilities on both the infrastructure/network and DevOps teams: maintaining and supporting the VMware-based virtualization infrastructure as a sysadmin, and developing an open-source software-defined storage solution that provides file, block, and object storage. This SDS is mainly used by the database team and by the new virtualization infrastructure, which has KVM at its heart.
2019-2022 ↴
As an HPC sysadmin, I installed and set up a CentOS-based cluster for parallel and high-performance computing use cases at Ferdowsi University of Mashhad. It has a queueing system based on SLURM and is stable enough that the only support required is registering new accounts; users just load the required modules and then execute their code.
Additionally, I deployed an on-demand dashboard for users who need a desktop container, Jupyter, VSCode, or Mathematica.
2020-2021 ↴
During this period I was an MLOps engineer at GreenWeb and Ferdowsi-Cloud. My task was to containerise machine learning and data analysis tools using a microservices architecture (GPU-as-a-service) and to set up an account manager panel using Docker Swarm.
Skills
vSphere & ESXi - QEMU & KVM - K8S & Docker
Ceph - DRBD, Pacemaker & Corosync, targetcli
Bourne Shell - Python - Ansible & Rundeck
Rocks Cluster - SLURM - Lmod - EasyBuild
Contact