24 changes: 18 additions & 6 deletions README.md
@@ -5,22 +5,34 @@ The OpenML documentation in written in MarkDown. The sources are generated by [M

The overall structure (navigation) of the docs is configured in the `mkdocs.yml` file.

Some of the APIs use other documentation generators, such as [Sphinx](https://restcoder.readthedocs.io/en/latest/sphinx-docgen.html) in openml-python. This documentation is pulled in via iframes to gather all docs into the same place, but they need to be edited in their own GitHub repos.
Some of the APIs use other documentation generators, such as [Sphinx](https://restcoder.readthedocs.io/en/latest/sphinx-docgen.html) in openml-python. This documentation is pulled in using the [multirepo plugin](https://github.com/jdoiro3/mkdocs-multirepo-plugin) to gather all docs into the same place, but they need to be edited in their own GitHub repos.

## Editing documentation
Documentation can be edited by simply editing the markdown files in the `docs` folder and creating a pull request.

End users can edit the docs by simply clicking the edit button (the pencil icon) on the top of every documentation page. It will open up an editing page on [GitHub](https://github.com/) (you do need to be logged in on GitHub). When you are done, add a small message explaining the change and click 'commit changes'. On the next page, just launch the pull request. We will then review it and approve the changes, or discuss them if necessary.

## Developing
To build the documentation locally, run `mkdocs serve -f mkdocs-local.yml` in the top directory (with the `mkdocs.yml` file). Any changes made after that will be hot-loaded.

To build the full documentation, including importing the documentation from other repositories, run `mkdocs serve` in the top directory (with the `mkdocs.yml` file). This can take a while to compile, so only use this when needed. You might also need to set `export NUMPY_EXPERIMENTAL_DTYPE_API=1` (or `set NUMPY_EXPERIMENTAL_DTYPE_API=1` on Windows).
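
As a rough sketch of the two build modes described above, a small cross-platform launcher could set the environment variable and choose the config file in one place. The helper names (`build_env`, `mkdocs_command`, `serve`) are hypothetical and not part of MkDocs or this repository; only the `mkdocs serve` command, the `-f mkdocs-local.yml` option and the `NUMPY_EXPERIMENTAL_DTYPE_API` variable come from the text.

```python
import os
import subprocess


def build_env(base=None):
    """Return a copy of the environment with the NumPy flag from the README set."""
    env = dict(os.environ if base is None else base)
    env["NUMPY_EXPERIMENTAL_DTYPE_API"] = "1"
    return env


def mkdocs_command(full=False):
    """Build the serve command: local config by default, full build on request."""
    cmd = ["mkdocs", "serve"]
    if not full:
        # The local config skips importing docs from other repositories.
        cmd += ["-f", "mkdocs-local.yml"]
    return cmd


def serve(full=False):
    """Run MkDocs from the current directory (assumed to contain mkdocs.yml)."""
    return subprocess.run(mkdocs_command(full), env=build_env(), check=True)
```

Calling `serve()` would start the quick local build, and `serve(full=True)` the slower full build.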

## Deployment
The documentation is hosted on GitHub pages.

To deploy the documentation, you need to have MkDocs and MkDocs-Material installed, and then run `mkdocs gh-deploy` in the top directory (with the `mkdocs.yml` file). This will build the HTML files and push them to the gh-pages branch of openml/docs. `https://docs.openml.org` is just a reverse proxy for `https://openml.github.io/docs/`.
To deploy the documentation, you need to have MkDocs installed locally, and then run `mkdocs gh-deploy` in the top directory (with the `mkdocs.yml` file). This will build the HTML files and push them to the gh-pages branch of openml/docs. `https://docs.openml.org` is just a reverse proxy for `https://openml.github.io/docs/`.

MKDocs and MkDocs-Material can be installed as follows:
MkDocs and all required extensions can be installed as follows:
```
pip install mkdocs
pip install mkdocs-material
pip install -U fontawesome_markdown
pip install -r requirements.txt
```

To test the documentation locally, run
```
mkdocs serve
```

To deploy to GitHub Pages, run
```
mkdocs gh-deploy
```
5 changes: 4 additions & 1 deletion docs/contributing/OpenML-Docs.md
@@ -13,7 +13,10 @@ combined into these documentation pages using [MkDocs multirepo](https://github.
git clone https://github.com/openml/docs.git
pip install -r requirements.txt
```
To build the documentation, run `mkdocs serve` in the top directory (with the `mkdocs.yml` file). Any changes made after that will be hot-loaded.

To build the documentation locally, run `mkdocs serve -f mkdocs-local.yml` in the top directory (with the `mkdocs.yml` file). Any changes made after that will be hot-loaded.

To build the full documentation, including importing the documentation from other repositories, run `mkdocs serve` in the top directory (with the `mkdocs.yml` file). This can take a while to compile, so only use this when needed.

The documentation will be auto-deployed with every push or merge with the master branch of `https://www.github.com/openml/docs/`. In the background, a CI job
will run `mkdocs gh-deploy`, which will build the HTML files and push them to the gh-pages branch of openml/docs. `https://docs.openml.org` is just a reverse proxy for `https://openml.github.io/docs/`.
51 changes: 5 additions & 46 deletions docs/index.md
@@ -15,56 +15,15 @@ icon: material/creation
<p><i class="fa fa-graduation-cap fa-fw fa-lg"></i>&nbsp; Make your work more visible and reusable</p>
<p><i class="fa fa-bolt fa-fw fa-lg"></i>&nbsp; Built for automation: streamline your experiments and model building</p>

## Installation
## How to use OpenML

The OpenML package is available in many languages and across libraries. For more information about them, see the [Integrations](./ecosystem/index.md) page.<br><br>
OpenML is accessible to a wide range of people:

=== "Python/sklearn"
:computer: <a href="https://www.openml.org" target="_blank">Explore the OpenML website</a> to discover, download and upload ML resources.

- [Python/sklearn repository](https://github.com/openml/openml-python)
- `pip install openml`
:robot: [Install an OpenML library](intro/index.md) to access and share resources programmatically through our APIs. Select one of the detailed guides in the top menu.

=== "Pytorch"

- [Pytorch repository](https://github.com/openml/openml-pytorch)
- `pip install openml-pytorch`

=== "Keras"

- [Keras repository](https://github.com/openml/openml-keras)
- `pip install openml-keras`

=== "TensorFlow"

- [TensorFlow repository](https://github.com/openml/openml-tensorflow)
- `pip install openml-tensorflow`

=== "R"

- [R repository](https://github.com/openml/openml-R)
- `install.packages("mlr3oml")`
=== "Julia"

- [Julia repository](https://github.com/JuliaAI/OpenML.jl/tree/master)
- `using Pkg;Pkg.add("OpenML")`

=== "RUST"

- [RUST repository](https://github.com/mbillingr/openml-rust)
- Install from source

=== ".Net"

- [.Net repository](https://github.com/openml/openml-dotnet)
- `Install-Package openMl`


You might also need to set up the API key. For more information, see [Authentication](http://localhost:8000/concepts/openness/).

## Learning OpenML

Aside from the individual package documentations, you can learn more about OpenML through the following resources:<br>
The core concepts of OpenML are explained in the [Concepts](./concepts/index.md) page. These concepts include the principle behind using Datasets, Runs, Tasks, Flows, Benchmarking and much more. Going through them will help you leverage OpenML even better in your work.<br>
:mortar_board: [Get started](./concepts/index.md) by learning more about the structure and concepts behind OpenML, such as Datasets, Tasks, Flows, Runs, Benchmarking and much more. This will help you leverage OpenML even better in your work.

## Contributing to OpenML

107 changes: 107 additions & 0 deletions docs/intro/index.md
@@ -0,0 +1,107 @@
---
icon: material/rocket-launch
---

## :computer: Installation

The OpenML package is available in many languages and has deep integration in many machine learning libraries.

=== "Python/sklearn"

    - [Python/sklearn repository](https://github.com/openml/openml-python)
    - `pip install openml`

=== "Pytorch"

    - [Pytorch repository](https://github.com/openml/openml-pytorch)
    - `pip install openml-pytorch`

=== "TensorFlow"

    - [TensorFlow repository](https://github.com/openml/openml-tensorflow)
    - `pip install openml-tensorflow`

=== "R"

    - [R repository](https://github.com/openml/openml-R)
    - `install.packages("mlr3oml")`

=== "Julia"

    - [Julia repository](https://github.com/JuliaAI/OpenML.jl/tree/master)
    - `using Pkg;Pkg.add("OpenML")`

=== "RUST"

    - [RUST repository](https://github.com/mbillingr/openml-rust)
    - Install from source

=== ".Net"

    - [.Net repository](https://github.com/openml/openml-dotnet)
    - `Install-Package openMl`

You can find detailed guides for the different libraries in the top menu.


## :key: Authentication

OpenML is entirely open and you do not need an account to access data (rate limits apply). However, <a href="https://www.openml.org" target="_blank">signing up via the OpenML website</a> is easy and free, and is required to upload new resources to OpenML and to manage them online.

API authentication happens via an **API key**, which you can find in your profile after logging in to openml.org.

```python
import openml

openml.config.apikey = "YOUR KEY"
```
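
A minimal sketch of keeping the key out of source code, assuming it is stored in an environment variable. The variable name `OPENML_APIKEY` and the helper `load_api_key` are chosen here for illustration only; `openml.config.apikey` is the setting shown above.

```python
import os


def load_api_key(env=None, var="OPENML_APIKEY"):
    """Return the API key from the environment, or None if it is unset or blank."""
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    return key or None


key = load_api_key()
if key is not None:
    try:
        import openml  # only needed once a key is available

        openml.config.apikey = key
    except ImportError:
        print("openml is not installed; run `pip install openml` first")
```

Without a key the script leaves the configuration untouched, which is enough for read-only access.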

## :joystick: Minimal Example

:material-database: Use the following code to load the [credit-g](https://www.openml.org/search?type=data&sort=runs&status=active&id=31) [dataset](https://docs.openml.org/concepts/data/) directly into a pandas dataframe. Note that OpenML can automatically load all datasets, separate data X and labels y, and give you useful dataset metadata (e.g. feature names and which ones have categorical data).

```python
import openml

dataset = openml.datasets.get_dataset("credit-g") # or by ID get_dataset(31)
X, y, categorical_indicator, attribute_names = dataset.get_data(target="class")
```


:trophy: Get a [task](https://docs.openml.org/concepts/tasks/) for [supervised classification on credit-g](https://www.openml.org/search?type=task&id=31&source_data.data_id=31).
Tasks specify how a dataset should be used, e.g. including train and test splits.

```python
task = openml.tasks.get_task(31)
dataset = task.get_dataset()
X, y, categorical_indicator, attribute_names = dataset.get_data(target=task.target_name)
# get splits for the first fold of 10-fold cross-validation
train_indices, test_indices = task.get_train_test_split_indices(fold=0)
```
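
To make the idea of fold indices concrete, here is a toy sketch of how a 10-fold split partitions row indices. It is not how OpenML computes its splits (those are fixed by the task so that everyone evaluates on the same folds); `kfold_indices` is a hypothetical helper written only for illustration.

```python
def kfold_indices(n_samples, n_folds=10, fold=0):
    """Return (train, test) index lists for one fold of an unshuffled k-fold split."""
    if not 0 <= fold < n_folds:
        raise ValueError("fold must be in [0, n_folds)")
    # Every n_folds-th row, starting at `fold`, goes to the test set.
    test = [i for i in range(n_samples) if i % n_folds == fold]
    train = [i for i in range(n_samples) if i % n_folds != fold]
    return train, test


train_indices, test_indices = kfold_indices(n_samples=1000, n_folds=10, fold=0)
print(len(train_indices), len(test_indices))  # 900 100
```

With a real task, the index arrays returned by `get_train_test_split_indices` can be used in the same way to select rows of `X` and `y`.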

:bar_chart: Use an [OpenML benchmarking suite](https://docs.openml.org/concepts/benchmarking/) to get a curated list of machine-learning tasks:
```python
suite = openml.study.get_suite("amlb-classification-all") # Get a curated list of tasks for classification
for task_id in suite.tasks:
    task = openml.tasks.get_task(task_id)
```

:star2: You can now benchmark your models easily across many datasets at once. Training and evaluating a model on a task is called a run:

```python
from sklearn import neighbors

task = openml.tasks.get_task(403)
clf = neighbors.KNeighborsClassifier(n_neighbors=5)
run = openml.runs.run_model_on_task(clf, task)
```

:raised_hands: You can now publish your experiment on OpenML so that others can build on it:

```python
myrun = run.publish()
print(f"kNN on {task.get_dataset().name}: {myrun.openml_url}")
```


## Learning more about OpenML

Next, check out the :rocket: [10 minute tutorial](notebooks/getting_started.ipynb) and the :mortar_board: [short description of OpenML concepts](concepts/index.md).
2 changes: 1 addition & 1 deletion docs/notebooks/getting_started.ipynb
@@ -49,7 +49,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Getting Started\n",
"# OpenML in 10 minutes\n",
"\n",
"This page will guide you through the process of getting started with OpenML. While this page is a good starting point, for more detailed information, please refer to the [integrations section](Scikit-learn/index.md) and the rest of the documentation.\n",
"\n"
13 changes: 10 additions & 3 deletions mkdocs-local.yml
@@ -82,6 +82,12 @@ markdown_extensions:
plugins:
  - autorefs
  - section-index
  - mkdocs-jupyter:
      ignore: ['temp_dir/**/*','docs/examples/**/*']
      theme: light
      remove_tag_config:
        remove_input_tags:
          - hide_code
  - redirects:
      redirect_maps:
        'APIs.md': 'https://www.openml.org/apis'
@@ -98,9 +104,10 @@ plugins:
  - git-committers:
      repository: openml/docs
nav:
  - OpenML:
    - Introduction: index.md
    - Getting Started: notebooks/getting_started.ipynb
  - OpenML: index.md
  - Get Started:
    - OpenML: intro/index.md
    - 10 Minute Tutorial: notebooks/getting_started.ipynb
  - Concepts:
    - Main concepts: concepts/index.md
    - Data: concepts/data.md
10 changes: 7 additions & 3 deletions mkdocs.yml
@@ -120,6 +120,8 @@ plugins:
docstring_section_style: table
show_docstring_functions: true
docstring_style: numpy
follow_imports: false
show_submodules: false
- gen-files:
scripts:
- scripts/gen_python_ref_pages.py
@@ -131,9 +133,10 @@ plugins:
  - git-committers:
      repository: openml/docs
nav:
  - OpenML:
    - Introduction: index.md
    - Getting Started: notebooks/getting_started.ipynb
  - OpenML: index.md
  - Get Started:
    - OpenML: intro/index.md
    - 10 Minute Tutorial: notebooks/getting_started.ipynb
  - Concepts:
    - Main concepts: concepts/index.md
    - Data: concepts/data.md
@@ -213,6 +216,7 @@ extra_css:
  - css/extra.css
extra_javascript:
  - js/extra.js
  - js/reset_nav.js
exclude_docs: |
  scripts/
  old/
21 changes: 11 additions & 10 deletions requirements.txt
@@ -5,14 +5,15 @@ mkdocs-redirects==1.2.1
mkdocs-jupyter==0.25.0
mkdocs-awesome-pages-plugin==2.9.3
mkdocs-multirepo-plugin==0.8.3
mkdocs-autorefs
mkdocs-section-index
mkdocs-gen-files
mkdocs-literate-nav
mkdocs-git-committers-plugin-2
mkdocs-git-revision-date-localized-plugin
mkdocstrings
mkdocstrings-python
markdown-include
mkdocs-autorefs==1.2.0
mkdocs-section-index==0.3.9
mkdocs-gen-files==0.5.0
mkdocs-literate-nav==0.6.1
mkdocs-git-committers-plugin-2==2.5.0
mkdocs-git-revision-date-localized-plugin==1.3.0
mkdocstrings==0.26.2
mkdocstrings-python==1.12.1
markdown-include==0.8.1
notebook==6.4.12
tqdm
jupyter_contrib_nbextensions==0.7.0
tqdm
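
The change above pins most plugins to exact versions but leaves `tqdm` unpinned. A small check like the following could flag such entries; `find_unpinned` is a hypothetical helper, and it only recognises the simple `name==version` form used in this file.

```python
def find_unpinned(lines):
    """Return requirement entries that are not pinned with '=='."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


sample = [
    "mkdocs-autorefs==1.2.0",
    "mkdocstrings==0.26.2",
    "notebook==6.4.12",
    "tqdm",
]
print(find_unpinned(sample))  # ['tqdm']
```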
63 changes: 31 additions & 32 deletions scripts/gen_python_ref_pages.py
@@ -12,43 +12,42 @@
import os
import shutil

# Move the python code and example folders into the root folder. This is necessary because the literate-nav has very strong
# opinions on where the files should be located. It refuses to work from the temp_dir directory.
def copy_folders_to_destinations(source_folders:list[str], destination_folders:list[str]):
    """
    Copies folders from source to specified destinations and overwrites if they already exist.

    Parameters:
    - source_folders (list of str): List of paths to the source folders.
    - destination_folders (list of str): List of full paths to the target directories, including the new folder names.
    """
    if len(source_folders) != len(destination_folders):
        return
# Clean a folder completely
def clean_folder(folder: Path):
    if folder.exists() and folder.is_dir():
        shutil.rmtree(folder)

    # Copy each folder to its specified destination
    for src, dest in zip(source_folders, destination_folders):
        # Ensure the parent directory of the destination path exists
        os.makedirs(os.path.dirname(dest), exist_ok=True)

        # Remove the folder if it already exists
        if os.path.exists(dest):
            shutil.rmtree(dest)

        # Copy the folder
        shutil.copytree(src, dest)

temp_dir = Path(__file__).parent.parent / "temp_dir" / "python"
root = Path(__file__).parent.parent
temp_dir = root / "temp_dir" / "python"

# Destination folders
destination_folders = [
    root / "docs" / "python",
    root / "docs" / "examples",
    root / "openml",
]

# Clean all destination folders
for folder in destination_folders:
    clean_folder(folder)

# Source folders
source_folders = [
    temp_dir / "docs",
    temp_dir / "openml",
    temp_dir / "examples",
    temp_dir / "openml",
]
destination_folders = [
    Path(__file__).parent.parent / "docs" / "python",
    Path(__file__).parent.parent / "openml",
    Path(__file__).parent.parent / "docs" / "examples" # Move them straight here to avoid duplication. mkdocs-jupyter will handle them.
]
copy_folders_to_destinations(source_folders, destination_folders)

# Copy source to destination
def copy_folders(source_folders: list[Path], destination_folders: list[Path]):
    if len(source_folders) != len(destination_folders):
        raise ValueError("Source and destination lists must have the same length.")

    for src, dest in zip(source_folders, destination_folders):
        if src.exists():
            shutil.copytree(src, dest)

copy_folders(source_folders, destination_folders)

# Generate the reference page docs
nav = mkdocs_gen_files.Nav()
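
The clean-then-copy pattern introduced in `scripts/gen_python_ref_pages.py` can be sketched in isolation as follows. This is a simplified variant rather than the script itself: it pairs each source with its destination in a tuple and cleans the destination right before copying, and all paths in the demo are made up.

```python
import shutil
import tempfile
from pathlib import Path


def clean_folder(folder: Path) -> None:
    """Delete a folder and everything in it, if it exists."""
    if folder.is_dir():
        shutil.rmtree(folder)


def copy_folders(pairs: list[tuple[Path, Path]]) -> list[Path]:
    """Clean each destination, then copy its source; return the destinations written."""
    written = []
    for src, dest in pairs:
        clean_folder(dest)
        if src.exists():  # skip sources that are not present
            shutil.copytree(src, dest)
            written.append(dest)
    return written


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "temp_dir" / "python" / "docs"
    src.mkdir(parents=True)
    (src / "index.md").write_text("# new\n")

    dest = root / "docs" / "python"
    dest.mkdir(parents=True)
    (dest / "stale.md").write_text("old\n")

    written = copy_folders([(src, dest), (root / "missing", root / "docs" / "examples")])
    remaining = sorted(p.name for p in dest.iterdir())
    print(remaining)  # ['index.md']
```

Keeping each source next to its destination means the two lists cannot drift out of order, which is the failure the length check in the script only partly guards against.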