diff --git a/README.md b/README.md
index 4b216167..57125686 100644
--- a/README.md
+++ b/README.md
@@ -1,106 +1,99 @@
 # Apache DataSketches Core C++ Library Component
 
-This is the core C++ component of the Apache DataSketches library. It contains all of the key sketching algorithms that are in the Java component and can be accessed directly from user applications.
-This component is also a dependency of other components of the library that create adaptors for target systems, such as PostgreSQL.
+This is the core C++ component of the Apache DataSketches library. It contains all the key sketching algorithms from the Java implementation and can be accessed directly by user applications.
 
-Note that we have a parallel core component for [Java]((https://github.com/apache/datasketches-java) and [Python]((https://github.com/apache/datasketches-python) implementations of the same sketch algorithms.
+This component is also a dependency of other library components that create adaptors for target systems, such as [PostgreSQL](https://github.com/apache/datasketches-postgresql).
+
+Note that we have parallel core library components for Java, Python, and Go implementations of many of the same sketch algorithms:
+
+- [datasketches-java](https://github.com/apache/datasketches-java)
+- [datasketches-python](https://github.com/apache/datasketches-python)
+- [datasketches-go](https://github.com/apache/datasketches-go)
 
 Please visit the main [Apache DataSketches website](https://datasketches.apache.org) for more information.
 
-If you are interested in making contributions to this site please see our [Community](https://datasketches.apache.org/docs/Community/) page for how to contact us.
+If you are interested in making contributions to this site, please see our [Community](https://datasketches.apache.org/docs/Community/) page for how to contact us.
 
 ---
 
 This code requires C++11.
 
-This library is header-only. The build process provided is only for building unit tests.
+This library is header-only. The provided build process is only for unit tests.
 
-Building the unit tests requires cmake 3.12.0 or higher.
+Building the unit tests requires CMake 3.12.0 or higher.
 
-Installing the latest cmake on OSX: brew install cmake
+Installing the latest CMake on OSX: `brew install cmake`.
 
-Building and running unit tests using cmake for OSX and Linux:
+Building and running unit tests using CMake for OSX and Linux:
 
-```
-    $ cmake -S . -B build/Release -DCMAKE_BUILD_TYPE=Release
-    $ cmake --build build/Release -t all test
+```shell
+cmake -S . -B build/Release -DCMAKE_BUILD_TYPE=Release
+cmake --build build/Release -t all test
 ```
 
-Building and running unit tests using cmake for Windows from the command line:
+Building and running unit tests using CMake for Windows from the command line:
 
-```
-    $ cd build
-    $ cmake ..
-    $ cd ..
-    $ cmake --build build --config Release
-    $ cmake --build build --config Release --target RUN_TESTS
+```shell
+cd build
+cmake ..
+cd ..
+cmake --build build --config Release
+cmake --build build --config Release --target RUN_TESTS
 ```
 
-To install a local distribution (OSX and Linux), use the following command. The
-CMAKE_INSTALL_PREFIX variable controls the destination. If not specified, it
-defaults to installing in /usr (/usr/include, /usr/lib, etc). In the command below,
-the installation will be in /tmp/install/DataSketches (/tmp/install/DataSketches/include,
-/tmp/install/DataSketches/lib, etc)
+To install a local distribution (OSX and Linux), use the following command. The `CMAKE_INSTALL_PREFIX` variable controls the destination. If not specified, it defaults to installing in /usr (/usr/include, /usr/lib, etc). In the command below, the installation will be in /tmp/install/DataSketches (/tmp/install/DataSketches/include, /tmp/install/DataSketches/lib, etc).
 
-```
-    $ cmake -S . -B build/Release -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/tmp/install/DataSketches
-    $ cmake --build build/Release -t install
+```shell
+cmake -S . -B build/Release -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/tmp/install/DataSketches
+cmake --build build/Release -t install
 ```
 
-To generate an installable package using cmake's built in cpack packaging tool,
-use the following command. The type of packaging is controlled by the CPACK_GENERATOR
-variable (semi-colon separated list). Cmake usually supports packaging types such as RPM,
-DEB, STGZ, TGZ, TZ, ZIP, etc.
+To generate an installable package using CMake's built-in CPack packaging tool, use the following command. The type of packaging is controlled by the `CPACK_GENERATOR` variable (a semicolon-separated list). CMake usually supports packaging formats such as RPM, DEB, STGZ, TGZ, TZ, and ZIP.
 
-```
-    $ cmake3 -S . -B build/Release -DCMAKE_BUILD_TYPE=Release -DCPACK_GENERATOR="RPM;STGZ;TGZ"
-    $ cmake3 --build build/Release -t package
+```shell
+cmake -S . -B build/Release -DCMAKE_BUILD_TYPE=Release -DCPACK_GENERATOR="RPM;STGZ;TGZ"
+cmake --build build/Release -t package
 ```
 
 The DataSketches project can be included in other projects' CMakeLists.txt files in one of two ways.
 
-If DataSketches has been installed on the host (using an RPM, DEB, "make install" into /usr/local, or some
-way, then CMake's `find_package` command can be used like this:
-```
-    find_package(DataSketches 3.2 REQUIRED)
-    target_link_library(my_dependent_target PUBLIC ${DATASKETCHES_LIB})
+If DataSketches has been installed on the host (using an RPM, DEB, "make install" into /usr/local, or some other way), then CMake's `find_package` command can be used like this:
+
+```cmake
+find_package(DataSketches 3.2 REQUIRED)
+target_link_libraries(my_dependent_target PUBLIC ${DATASKETCHES_LIB})
 ```
 
 When used with find_package, DataSketches exports several variables, including
 
- - `DATASKETCHES_VERSION`: The version number of the datasketches package that was imported.
- - `DATASKETCHES_INCLUDE_DIR`: The directory that should be added to access DataSketches include files.
-   Because cmake automatically includes the interface directories for included target libraries when
-   using `target_link_library`, under normal circumstances there will be no need to include this directly.
- - `DATASKETCHES_LIB`: The name of the DataSketches target to include as a dependency. Projects pulling
-   in DataSketches should reference this with `target_link_library` in order to set up all the correct dependencies
-   and include paths.
-
-If you don't have DataSketches installed locally, dependent projects can pull it directly
-from GitHub using CMake's `ExternalProject` module. The code would look something like this:
-
-```
-    cmake_policy(SET CMP0097 NEW)
-    include(ExternalProject)
-    ExternalProject_Add(datasketches
-        GIT_REPOSITORY https://github.com/apache/datasketches-cpp.git
-        GIT_TAG 3.2.0
-        GIT_SHALLOW true
-        GIT_SUBMODULES ""
-        INSTALL_DIR /tmp/datasketches-prefix
-        CMAKE_ARGS -DBUILD_TESTS=OFF -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} -DCMAKE_INSTALL_PREFIX=/tmp/datasketches-prefix
-
-        # Override the install command to add DESTDIR
-        # This is necessary to work around an oddity in the RPM (but not other) package
-        # generation, as CMake otherwise picks up the Datasketch files when building
-        # an RPM for a dependent package. (RPM scans the directory for files in addition to installing
-        # those files referenced in an "install" rule in the cmake file)
-        INSTALL_COMMAND env DESTDIR= ${CMAKE_COMMAND} --build . --target install
-    )
-    ExternalProject_Get_property(datasketches INSTALL_DIR)
-    set(datasketches_INSTALL_DIR ${INSTALL_DIR})
-    message("Source dir of datasketches = ${datasketches_INSTALL_DIR}")
-    target_include_directories(my_dependent_target
-                                PRIVATE ${datasketches_INSTALL_DIR}/include/DataSketches)
-    add_dependencies(my_dependent_target datasketches)
+- `DATASKETCHES_VERSION`: The version number of the datasketches package that was imported.
+- `DATASKETCHES_INCLUDE_DIR`: The directory that should be added to access DataSketches include files. Because CMake automatically includes the interface directories for included target libraries when using `target_link_libraries`, under normal circumstances, there will be no need to include this directly.
+- `DATASKETCHES_LIB`: The name of the DataSketches target to include as a dependency. Projects pulling in DataSketches should reference this with `target_link_libraries` in order to set up all the correct dependencies and include paths.
+
+If you don't have DataSketches installed locally, dependent projects can pull it directly from GitHub using CMake's `ExternalProject` module. The code would look something like this:
+
+```cmake
+cmake_policy(SET CMP0097 NEW)
+include(ExternalProject)
+ExternalProject_Add(datasketches
+    GIT_REPOSITORY https://github.com/apache/datasketches-cpp.git
+    GIT_TAG 3.2.0
+    GIT_SHALLOW true
+    GIT_SUBMODULES ""
+    INSTALL_DIR /tmp/datasketches-prefix
+    CMAKE_ARGS -DBUILD_TESTS=OFF -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} -DCMAKE_INSTALL_PREFIX=/tmp/datasketches-prefix
+
+    # Override the install command to add DESTDIR
+    # This is necessary to work around an oddity in the RPM (but not other) package
+    # generation, as CMake otherwise picks up the DataSketches files when building
+    # an RPM for a dependent package. (RPM scans the directory for files in addition to installing
+    # those files referenced in an "install" rule in the cmake file)
+    INSTALL_COMMAND env DESTDIR= ${CMAKE_COMMAND} --build . --target install
+)
+ExternalProject_Get_Property(datasketches INSTALL_DIR)
+set(datasketches_INSTALL_DIR ${INSTALL_DIR})
+message("Install dir of datasketches = ${datasketches_INSTALL_DIR}")
+target_include_directories(my_dependent_target
+                            PRIVATE ${datasketches_INSTALL_DIR}/include/DataSketches)
+add_dependencies(my_dependent_target datasketches)
 ```