Data Scientist - 12 month contract
Ref No.: 18-00241
Location: COLUMBIA, South Carolina
Essential Responsibilities

1. Work alongside appropriate staff, teams, stakeholders, and other points of contact (POCs), as required, to understand the goals and objectives of complex information systems
2. Enhance data collection procedures to include information that is relevant for building analytic systems
3. Select features, build and optimize classifiers using machine learning techniques
4. Process, cleanse and verify the integrity of data used for analysis
5. Mine data using state-of-the-art methods
6. Create automated anomaly detection systems
Program Experience:

Experience must include well-documented success in applying statistical skills such as distributions, statistical testing, regression, etc.

Experience with conducting ad-hoc and regularly scheduled analysis and presenting results in a clear manner is ideal.

Experience with ontologies, semantic web modeling, and data modeling would be considered desirable for this position.

Technical Experience:

Experience with any or all of the following technologies is desirable for this position:

• Medicaid Management Information Systems (or other Health Information Technologies)
• Data science and visualization using technology such as R, RStudio, QlikView, and Tableau
• AI/Neural Nets
• UML and architectural modeling using tools such as Rational and SPARX
• Big Data and NoSQL technologies such as MongoDB, MarkLogic, Cassandra, and Hadoop
• Fluency in a scripting language such as Python or R
• The ideal candidate has experience in all of the following product categories with at least one of the corresponding vendor technologies:
Product Category: Vendor Technology

Data Science & Viz: R, RStudio, SPSS Modeler, SPSS, QlikView, MDX, Tableau, Anaconda Spyder, SAS, BIRT, SSAS, SAS EM
AI/Neural Nets: TensorFlow, Keras, Word2vec, Doc2vec, CNNs, ANNs, LSTM, RNNs, GANs, Theano, Torch, Bidirectional LSTM
Ontology, Semantic Web Modeling: Magic Draw Visual Ontology Modeler, Smartlogic, RDF, RDFA, Turtle, SKOS, OWL, OWL2, SPARQL, Linked Data, Neo4j, Open World Lexicography Assumptions, Ontology Frameworks, Revelytix, Protege
Architectural Modeling: Magic Draw, Mega, Troux, IBM Rational, Sparx
UML Modeling: MagicDraw Zachman and TOGAF, RUP, ArgoUML, RSA
Glossary, Models: BG, ACORD Framework, Automotive All Divisions; Healthcare; Utility; Gas & Oil Process; GRC; Enterprise, Universal
Big Data, NoSQL: Hortonworks, Cloudera, Client, Sqrrl, Cassandra, MarkLogic, Couchbase, Cloudant, Alpine Labs, DataStax, MongoDB
Pivotal Platform: HAWQ, Gemfire, Spring XD, MADlib, PivotalR, Greenplum, PostgreSQL, PL/R, Pythonu, plpy
Data Modeling: IBM IDA, UML, ERwin, Sandhill, Rational Data Architect, Star Schema, Snowflake, PowerDesigner, Navigator, James Martin, IEF, IEW, 3rd Normal Form, ER/Studio, SA, RDA, ADRM, Big Data Modeling Techniques
ETL, ELT, ETML, EAI: Sqoop, pig, Composite, Information Server, Custom, Informatica, ETI, Data Stage EE, SAS ETL, SSIS 2008; Talend
Data Profiling/Quality: Exeros/CA ERwin Data Profiler (now IBM Optim), Evoke AXIO (now Informatica Data Explorer), SAS, Profile Stage, Information Analyzer, Talend Open Profiler; BODS Data Profiler, EIM, Information Steward; Trillium
Linguistic Algorithms: R tm, NLTK, Gensim, SpaCy, Sense2vec, Triplets, Linguistics Analysis Services, NLP, CL, Collocation Analysis, Generative Patterns, Dependency Grammars, SLING, DRAGNN, SyntaxNet, sonnet
Metadata: MITI MIMB, Unicorn, IBM Metadata Workbench, MetaStage, Ron Ross, Platinum Repository, Global IDS
Database: Pivotal Hawq, Hortonworks, Kudu, z/OS DB2, UDB DB2, SQL Server, Netezza, Oracle, MySQL, PostgreSQL
Graph Databases/Layers: AllegroGraph, GraphLab/Dato, Giraph, Graphx, Neo4j
EDW/DM Methodologies: Kimball Conformed Dimensions, Chris Adamson, Inmon CIF
Client, Data Mining: R tm, NLTK, SAS EM, IBM Intelligent Miner, SQL Server Data Miner, Predixion
Programming Languages: C, C++, C#, Python, SQL, Scala, Julia, J2EE, Perl, Bash, JMS, Ruby on Rails, Clojure, JavaScript


General Duties and Responsibilities:

1. Research and develop statistical learning models for data analysis
2. Implement new methodologies as needed for specific models or analysis
3. Conduct data collection, preprocessing and analysis
4. Collaborate with agency leadership, business partners and other parties/stakeholders to understand agency needs and provide recommendations and possible solutions

REQUIRED SKILLS (RANK IN ORDER OF IMPORTANCE):

1. 5+ years practical experience with data processing, data visualization and data analytics
2. 5+ years of experience coordinating complex data architecture to align with business needs
3. 5+ years quantitative analysis experience
4. 5+ years debugging experience

PREFERRED SKILLS (RANK IN ORDER OF IMPORTANCE):

1. Prior experience in working with query languages, probability tools, data analytics tools and business intelligence tools
2. Prior Health Information Technology and/or Program experience
3. Prior experience with South Carolina Medicaid, Social Services, or similar public benefit programs

REQUIRED EDUCATION/CERTIFICATIONS:

1. College Degree or equivalent work experience required. Preference will be given to, in no particular order:
a. BS degree in Computer Science, Applied Math or similar discipline.