Data Science @ Scale

HDP and IBM Data Science Experience

Improve the success of your data science initiatives

Download the white paper

Data science is a game changer for businesses

Data science is an interdisciplinary field that combines machine learning, statistics, advanced analytics, and programming. This new discipline produces valuable insights and puts data to work in the cognitive era.

IBM Data Science Experience (DSX) is an enterprise platform for data scientists and data engineers. It offers out-of-the-box open-source and commercial data science tools including RStudio, Apache Spark, Jupyter, and Zeppelin notebooks. DSX supports the entire data science lifecycle from data preparation and ETL to model development and deployment. With DSX, companies can build predictive and machine learning models using their favorite tools, technologies, and libraries, while leveraging the scale, security and governance of the HDP platform.


Data lifecycle


Access to community

DSX provides a social environment where data scientists can research and share articles, data sets, notebooks, and tutorials. Data scientists and analysts can get up to speed by taking courses in R, Python, or Scala, copying content into a Jupyter or Zeppelin notebook, or working in an embedded RStudio environment.

  • Find tutorials and datasets
  • Connect with data scientists and ask questions
  • Research articles and papers
  • Fork and share projects

Blog: Certification of IBM Data Science Experience (DSX) on HDP is a Win-Win for Customers

Use familiar open source tools and libraries

With DSX, data scientists can create new Jupyter or Zeppelin notebooks in R, Python, or Scala, or import an existing notebook. DSX includes popular open source libraries such as PySpark, matplotlib, and Spark ML, along with machine learning and deep learning APIs. Data scientists can use open source visualization libraries like Brunel and PixieDust to tell a compelling story, and they can install other open source libraries of their choice.

  • Code in Scala, Python, R, Apache Spark and SQL
  • Visualize and share code using Zeppelin & Jupyter Notebooks
  • Leverage RStudio IDE and Shiny
  • Use your favorite libraries including scikit-learn, XGBoost, Spark MLlib, TensorFlow, Caffe, Keras and MXNet

Webinar: From Data Science to Enterprise Data Science @ Scale

Operationalize models with one click

With DSX, administrators can deploy models with one click and monitor all runtime environments and services.

  • Data Shaping Pipeline UI
  • Auto-data preparation & modeling
  • Advanced Visualizations
  • Model management & deployment
  • Documented Model APIs

Solution Brief: Data Science Machine Learning

Scale and enterprise security

The combination of HDP and DSX lets enterprises run data science at scale by using all the data in the data lake while applying enterprise-grade security, governance, and operations.

  • Data Science at Scale - Run Spark Jobs on HDP Cluster
  • Secure Hadoop Support using Apache Ranger
  • Support for attribute-based access control (ABAC) using Apache Ranger
Blog: An Exciting Data Science Experience on HDP