Merged
39 changes: 33 additions & 6 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,16 +9,34 @@ on:
- 'setup.cfg'
- '.github/workflows/**'
pull_request:
paths:
- 'deepctr/**'
- 'tests/**'
- 'setup.py'
- 'setup.cfg'
- '.github/workflows/**'
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.ref_name }}
cancel-in-progress: true

jobs:
changes:
runs-on: ubuntu-22.04
outputs:
code_related: ${{ steps.filter.outputs.code_related }}
steps:
- uses: actions/checkout@v5

- name: Detect code-related changes
id: filter
uses: dorny/paths-filter@v3
with:
filters: |
code_related:
- 'deepctr/**'
- 'tests/**'
- 'setup.py'
- 'setup.cfg'
- '.github/workflows/**'

build:
needs: changes
runs-on: ubuntu-22.04
timeout-minutes: 240
strategy:
Expand All @@ -29,14 +47,21 @@ jobs:
tf-version: "1.15.5"

steps:
- name: Docs-only PR fast-path
if: ${{ github.event_name == 'pull_request' && needs.changes.outputs.code_related != 'true' }}
run: echo "Docs-only PR detected; skipping heavy CI steps."

- uses: actions/checkout@v5
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true' }}

- name: Setup python environment
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true' }}
uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true' }}
run: |
python -m pip install -q --upgrade pip setuptools wheel
python -m pip install -q "numpy<2"
Expand All @@ -47,11 +72,13 @@ jobs:
python -m pip check

- name: Test with pytest
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true' }}
timeout-minutes: 240
run: |
pytest --cov=deepctr --cov-report=xml --cov-report=term-missing:skip-covered

- name: Upload coverage to Codecov
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true' }}
uses: codecov/codecov-action@v6
with:
token: ${{ secrets.CODECOV_TOKEN }}
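The step-level `if:` conditions above reduce to a single predicate: heavy steps run for every event that is not a `pull_request`, and for pull requests only when the `changes` job reports `code_related == 'true'`. The following is a minimal Python sketch of that truth table, intended only as an aid for reasoning about the gating. The function names `should_run_heavy_steps` and `is_docs_only_fast_path` are hypothetical and do not appear in the workflow.

```python
def should_run_heavy_steps(event_name: str, code_related: str) -> bool:
    """Mirror the workflow expression
    github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true'.

    Job outputs in GitHub Actions are strings, so the comparison is against
    the literal string 'true', not a boolean.
    """
    return event_name != "pull_request" or code_related == "true"


def is_docs_only_fast_path(event_name: str, code_related: str) -> bool:
    """Mirror the condition on the 'Docs-only PR fast-path' step."""
    return event_name == "pull_request" and code_related != "true"


if __name__ == "__main__":
    # Print the full truth table for the events this workflow listens to.
    for event in ("push", "pull_request", "workflow_dispatch"):
        for flag in ("true", "false"):
            print(event, flag, should_run_heavy_steps(event, flag))
```

The two predicates are exact complements, so on any run either the fast-path step or the heavy steps execute, never both. The job itself still starts and reports success on docs-only pull requests.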
39 changes: 33 additions & 6 deletions .github/workflows/ci2.yml
Expand Up @@ -9,16 +9,34 @@ on:
- 'setup.cfg'
- '.github/workflows/**'
pull_request:
paths:
- 'deepctr/**'
- 'tests/**'
- 'setup.py'
- 'setup.cfg'
- '.github/workflows/**'
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.ref_name }}
cancel-in-progress: true

jobs:
changes:
runs-on: ubuntu-22.04
outputs:
code_related: ${{ steps.filter.outputs.code_related }}
steps:
- uses: actions/checkout@v5

- name: Detect code-related changes
id: filter
uses: dorny/paths-filter@v3
with:
filters: |
code_related:
- 'deepctr/**'
- 'tests/**'
- 'setup.py'
- 'setup.cfg'
- '.github/workflows/**'

build:
needs: changes
runs-on: ubuntu-22.04
timeout-minutes: 180
strategy:
Expand Up @@ -47,14 +65,21 @@ jobs:
TF_USE_LEGACY_KERAS: ${{ matrix.use-legacy-keras }}

steps:
- name: Docs-only PR fast-path
if: ${{ github.event_name == 'pull_request' && needs.changes.outputs.code_related != 'true' }}
run: echo "Docs-only PR detected; skipping heavy CI steps."

- uses: actions/checkout@v5
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true' }}

- name: Setup python environment
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true' }}
uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true' }}
run: |
python -m pip install -q --upgrade pip setuptools wheel
python -m pip install -q "numpy<2"
Expand All @@ -67,11 +92,13 @@ jobs:
python -m pip check

- name: Test with pytest
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true' }}
timeout-minutes: 180
run: |
pytest --cov=deepctr --cov-report=xml --cov-report=term-missing:skip-covered

- name: Upload coverage to Codecov
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.code_related == 'true' }}
uses: codecov/codecov-action@v6
with:
token: ${{ secrets.CODECOV_TOKEN }}
12 changes: 10 additions & 2 deletions .readthedocs.yml
@@ -1,5 +1,13 @@
version: 2

build:
image: latest
os: ubuntu-22.04
tools:
python: "3.10"

sphinx:
configuration: docs/source/conf.py

python:
version: 3.6
install:
- requirements: docs/requirements.readthedocs.txt
15 changes: 14 additions & 1 deletion README.md
Expand Up @@ -25,7 +25,20 @@ core components layers which can be used to easily build custom models.You can u

- Provide `tf.keras.Model` like interfaces for **quick experiment**. [example](https://deepctr-doc.readthedocs.io/en/latest/Quick-Start.html#getting-started-4-steps-to-deepctr)
- Provide `tensorflow estimator` interface for **large scale data** and **distributed training**. [example](https://deepctr-doc.readthedocs.io/en/latest/Quick-Start.html#getting-started-4-steps-to-deepctr-estimator-with-tfrecord)
- It is compatible with both `tf 1.x` and `tf 2.x`.
- It is compatible with both `tf 1.15` and `tf 2.x`.

## Installation and compatibility

DeepCTR does not pin or install TensorFlow for you. Install a TensorFlow build that matches your Python, NumPy, CPU/GPU, and operating system first, then install DeepCTR:

```bash
pip install tensorflow
pip install deepctr
```

For Python `>=3.9`, DeepCTR allows modern `h5py` releases with `h5py>=3.7.0`. If TensorFlow reports a NumPy conflict, follow the TensorFlow requirement for your selected TensorFlow release, for example using `numpy<2` when required by TensorFlow.

Use public `tensorflow.keras` APIs in your own code and examples. Avoid mixing `tensorflow.python.keras` with `tensorflow.keras`, because `tensorflow.python.*` is private TensorFlow API and can break model serialization or optimizer/metric loading across TensorFlow versions.
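One way to check your own project for private imports is a small scanner. The sketch below uses only the Python standard library and flags lines that import from `tensorflow.python`. The helper names `find_private_tf_imports` and `scan_tree` are hypothetical and are not part of DeepCTR or TensorFlow. The regular expression is a simple line-based heuristic, so imports built dynamically or split across lines would be missed.

```python
import re
from pathlib import Path

# Matches "from tensorflow.python..." and "import tensorflow.python..." at the
# start of a line (after optional indentation). Heuristic only.
PRIVATE_IMPORT = re.compile(r"^\s*(?:from|import)\s+tensorflow\.python(?:\.|\s|$)")


def find_private_tf_imports(source: str):
    """Return (line_number, stripped_line) pairs for tensorflow.python imports."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if PRIVATE_IMPORT.match(line)
    ]


def scan_tree(root="."):
    """Scan every .py file under root and map file path -> list of hits."""
    hits = {}
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = find_private_tf_imports(text)
        if found:
            hits[str(path)] = found
    return hits


if __name__ == "__main__":
    for filename, lines in scan_tree(".").items():
        for number, line in lines:
            print(f"{filename}:{number}: {line}")
```

Each reported line can usually be rewritten by replacing the `tensorflow.python.keras` prefix with `tensorflow.keras`, provided the symbol is exported publicly in your TensorFlow release.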

Some related projects:

14 changes: 12 additions & 2 deletions docs/requirements.readthedocs.txt
@@ -1,2 +1,12 @@
tensorflow==2.6.2
recommonmark==0.7.1
numpy<2
Jinja2<3.1
docutils<0.18
sphinx==4.5.0
sphinx-rtd-theme==0.5.2
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.0
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.5
recommonmark==0.7.1
tensorflow==2.15.0
15 changes: 7 additions & 8 deletions docs/source/Examples.md
Expand Up @@ -206,7 +206,7 @@ following codes.
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from tensorflow.python.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.sequence import pad_sequences

from deepctr.models import DeepFM
from deepctr.feature_column import SparseFeat, VarLenSparseFeat, get_feature_names
Expand Down Expand Up @@ -278,7 +278,7 @@ if __name__ == "__main__":
```python
import numpy as np
import pandas as pd
from tensorflow.python.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.sequence import pad_sequences

from deepctr.feature_column import SparseFeat, VarLenSparseFeat, get_feature_names
from deepctr.models import DeepFM
Expand Down Expand Up @@ -334,7 +334,7 @@ from deepctr.models import DeepFM
from deepctr.feature_column import SparseFeat, VarLenSparseFeat, get_feature_names
import numpy as np
import pandas as pd
from tensorflow.python.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.sequence import pad_sequences

try:
import tensorflow.compat.v1 as tf
Expand Down Expand Up @@ -401,7 +401,6 @@ and run the following codes.
```python
import tensorflow as tf

from tensorflow.python.ops.parsing_ops import FixedLenFeature
from deepctr.estimator import DeepFMEstimator
from deepctr.estimator.inputs import input_fn_tfrecord

Expand All @@ -425,10 +424,10 @@ if __name__ == "__main__":

# 2.generate input data for model

feature_description = {k: FixedLenFeature(dtype=tf.int64, shape=1) for k in sparse_features}
feature_description = {k: tf.io.FixedLenFeature(dtype=tf.int64, shape=1) for k in sparse_features}
feature_description.update(
{k: FixedLenFeature(dtype=tf.float32, shape=1) for k in dense_features})
feature_description['label'] = FixedLenFeature(dtype=tf.float32, shape=1)
{k: tf.io.FixedLenFeature(dtype=tf.float32, shape=1) for k in dense_features})
feature_description['label'] = tf.io.FixedLenFeature(dtype=tf.float32, shape=1)

train_model_input = input_fn_tfrecord('./criteo_sample.tr.tfrecords', feature_description, 'label', batch_size=256,
num_epochs=1, shuffle_factor=10)
Expand Down Expand Up @@ -597,4 +596,4 @@ if __name__ == "__main__":
print("test marital AUC", round(roc_auc_score(test['label_marital'], pred_ans[1]), 4))


```
```
33 changes: 25 additions & 8 deletions docs/source/FAQ.md
Expand Up @@ -13,7 +13,7 @@ model.load_weights('DeepFM_w.h5')
Saving and loading whole models is only slightly different.

```python
from tensorflow.python.keras.models import save_model,load_model
from tensorflow.keras.models import save_model, load_model
model = DeepFM()
save_model(model, 'DeepFM.h5')# save_model, same as before

Expand All @@ -27,8 +27,8 @@ Here is a example of how to set learning rate and earlystopping:

```python
import deepctr
from tensorflow.python.keras.optimizers import Adam,Adagrad
from tensorflow.python.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam, Adagrad
from tensorflow.keras.callbacks import EarlyStopping

model = deepctr.models.DeepFM(linear_feature_columns,dnn_feature_columns)
model.compile(Adagrad(0.1024),'binary_crossentropy',metrics=['binary_crossentropy'])
Expand Down Expand Up @@ -61,8 +61,8 @@ import itertools
import deepctr
from deepctr.models import AFM
from deepctr.feature_column import get_feature_names
from tensorflow.python.keras.models import Model
from tensorflow.python.keras.layers import Lambda
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Lambda

model = AFM(linear_feature_columns,dnn_feature_columns)
model.fit(model_input,target)
Expand Down Expand Up @@ -145,10 +145,27 @@ model.fit(model_input,label)
```

## 7. How to run the demo with GPU ?
just install deepctr with
Install the TensorFlow build recommended for your CUDA, cuDNN, and platform combination, then install `deepctr`.
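After installation, you may want to confirm that TensorFlow can see a GPU before running the demo. The snippet below is a minimal sketch. The helper name `visible_gpus` is hypothetical, and the fallback to `tf.config.experimental` assumes an older TensorFlow release such as `1.15`, where the non-experimental function is not yet available.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # tf.config.list_physical_devices is the public entry point in recent
    # TensorFlow 2.x releases; older releases expose it under
    # tf.config.experimental instead.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices
    return [device.name for device in list_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed.")
    elif not gpus:
        print("TensorFlow is installed but no GPU is visible.")
    else:
        print("Visible GPUs:", gpus)
```

An empty list usually points to a CPU-only TensorFlow build or a CUDA/cuDNN mismatch rather than to a problem in `deepctr`.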

## 8. How to avoid TensorFlow, Keras, h5py, or NumPy compatibility errors?

Install TensorFlow separately before installing DeepCTR. Pick the TensorFlow release according to your Python version, CPU/GPU environment, and platform.

```bash
$ pip install deepctr[gpu]
$ pip install tensorflow
$ pip install deepctr
```

## 8. How to run the demo with multiple GPUs
For Python `>=3.9`, DeepCTR uses `h5py>=3.7.0`, so newer `h5py` releases such as `3.9+` and `3.12+` are allowed. If TensorFlow reports a NumPy conflict, follow the TensorFlow requirement for the TensorFlow release you installed, for example using `numpy<2` when required by TensorFlow.
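When diagnosing such a conflict, it helps to see which versions are actually installed in the active environment. The following sketch uses the standard-library `importlib.metadata` module, which assumes Python `>=3.8`. The helper name `installed_versions` is hypothetical, and the default package list is only an illustration.

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "numpy", "h5py", "deepctr")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Compare the reported NumPy and `h5py` versions against the requirements of the TensorFlow release you installed, and run `python -m pip check` to list declared conflicts.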

Use public `tensorflow.keras` imports in your own code:

```python
from tensorflow.keras.models import load_model
from tensorflow.keras.optimizers import Adam
```

Avoid mixing `tensorflow.python.keras` with `tensorflow.keras`. `tensorflow.python.*` is private TensorFlow API and can break serialization, optimizer loading, or metric loading across TensorFlow versions.

## 9. How to run the demo with multiple GPUs
You can use multiple GPUs with TensorFlow versions higher than ``1.4``; see [run_classification_criteo_multi_gpu.py](https://github.com/shenweichen/DeepCTR/blob/master/examples/run_classification_criteo_multi_gpu.py).
28 changes: 18 additions & 10 deletions docs/source/Quick-Start.md
@@ -1,18 +1,27 @@
# Quick-Start
[![](https://pai-public-data.oss-cn-beijing.aliyuncs.com/EN-pai-dsw.svg)](https://dsw-dev.data.aliyun.com/#/?fileUrl=https://pai-public-data.oss-cn-beijing.aliyuncs.com/deep-ctr/Getting-started-4-steps-to-DeepCTR.ipynb&fileName=Getting-started-4-steps-to-DeepCTR.ipynb)
## Installation Guide
Now `deepctr` is available for python `2.7 `and `3.5, 3.6, 3.7`.
`deepctr` depends on tensorflow, you can specify to install the cpu version or gpu version through `pip`.
Now `deepctr` supports Python `>=3.7` and is tested with TensorFlow `1.15` and TensorFlow `2.x`.

### CPU version
DeepCTR does not pin or install TensorFlow for you. Install a TensorFlow build that matches your Python, NumPy, CPU/GPU, and operating system first, then install DeepCTR:

```bash
$ pip install deepctr[cpu]
$ pip install tensorflow
$ pip install deepctr
```
### GPU version

For GPU environments, install the TensorFlow package recommended for your CUDA, cuDNN, and platform combination, then install `deepctr`.

For Python `>=3.9`, DeepCTR allows modern `h5py` releases with `h5py>=3.7.0`. If TensorFlow reports a NumPy conflict, follow the TensorFlow requirement for your selected TensorFlow release, for example using `numpy<2` when required by TensorFlow.

Use public `tensorflow.keras` APIs in your own code. Avoid mixing `tensorflow.python.keras` with `tensorflow.keras`, because `tensorflow.python.*` is private TensorFlow API and can break model serialization or optimizer/metric loading across TensorFlow versions.

### Install from source

```bash
$ pip install deepctr[gpu]
$ git clone https://github.com/shenweichen/DeepCTR.git
$ cd DeepCTR
$ pip install .
```
## Getting started: 4 steps to DeepCTR

Expand Down Expand Up @@ -128,7 +137,6 @@ You also can run a distributed training job with the keras model on Kubernetes u
```python
import tensorflow as tf

from tensorflow.python.ops.parsing_ops import FixedLenFeature
from deepctr.estimator.inputs import input_fn_tfrecord
from deepctr.estimator.models import DeepFMEstimator

Expand All @@ -155,10 +163,10 @@ for feat in dense_features:
### Step 3: Generate the training samples with TFRecord format

```python
feature_description = {k: FixedLenFeature(dtype=tf.int64, shape=1) for k in sparse_features}
feature_description = {k: tf.io.FixedLenFeature(dtype=tf.int64, shape=1) for k in sparse_features}
feature_description.update(
{k: FixedLenFeature(dtype=tf.float32, shape=1) for k in dense_features})
feature_description['label'] = FixedLenFeature(dtype=tf.float32, shape=1)
{k: tf.io.FixedLenFeature(dtype=tf.float32, shape=1) for k in dense_features})
feature_description['label'] = tf.io.FixedLenFeature(dtype=tf.float32, shape=1)

train_model_input = input_fn_tfrecord('./criteo_sample.tr.tfrecords', feature_description, 'label', batch_size=256,
num_epochs=1, shuffle_factor=10)