-
Notifications
You must be signed in to change notification settings - Fork 99
MINIFICPP-2342 Do not overwrite config files during an upgrade #2069
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
MINIFICPP-2342 Do not overwrite config files during an upgrade #2069
Conversation
ebca3b0 to
604baba
Compare
| #include "utils/Environment.h" | ||
|
|
||
| namespace { | ||
| bool fileContentsMatch(const std::filesystem::path& file_name, const std::unordered_set<std::string>& expected_contents) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of this function is a bit misleading, it could be something line propertiesMatchInFile to indicate that this is not a generic file content match check. Also this might be a bit shorter like this:
for (std::string line; std::getline(file, line); ) {
actual_contents.insert(std::move(line));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have renamed the function to settingsInFileAreAsExpected() and changed the while loop to a for loop in fb3cdef
I don't think the move would be correct here, since we would reuse line after we moved from it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(thread reply) I think move is valid: the moved-from state is messy in general, some people say invalid, the standard says unspecified but valid state. But if it's reassigned (by getfile), then it probably doesn't matter. I would be a proponent of a destructive move / relocate, which would prevent reuse like this, but we don't have it yet.
I think it's fine either way.
| const auto extra_properties_files_dir = extra_properties_files_dir_name(); | ||
| std::vector<std::filesystem::path> extra_properties_file_names; | ||
| if (utils::file::exists(extra_properties_files_dir) && utils::file::is_directory(extra_properties_files_dir)) { | ||
| utils::file::list_dir(extra_properties_files_dir, [&](const std::filesystem::path&, const std::filesystem::path& file_name) { | ||
| if (!file_name.string().ends_with(".bak")) { | ||
| extra_properties_file_names.push_back(file_name); | ||
| } | ||
| return true; | ||
| }, logger_, /* recursive = */ false); | ||
| } | ||
| std::ranges::sort(extra_properties_file_names); | ||
| for (const auto& file_name : extra_properties_file_names) { | ||
| properties_files_.push_back(extra_properties_files_dir / file_name); | ||
| } | ||
|
|
||
| std::ifstream file(properties_file_, std::ifstream::in); | ||
| if (!file.good()) { | ||
| logger_->log_error("load configure file failed {}", properties_file_); | ||
| return; | ||
| logger_->log_info("Using configuration file to load configuration for {} from {} (located at {})", | ||
| getName().c_str(), configuration_file.string(), base_properties_file_.string()); | ||
| if (!extra_properties_file_names.empty()) { | ||
| auto list_of_files = utils::string::join(", ", extra_properties_file_names, [](const auto& path) { return path.string(); }); | ||
| logger_->log_info("Also reading configuration from files {} in {}", list_of_files, extra_properties_files_dir.string()); | ||
| } | ||
|
|
||
| properties_.clear(); | ||
| dirty_ = false; | ||
| for (const auto& line : PropertiesFile{file}) { | ||
| auto key = line.getKey(); | ||
| auto persisted_value = line.getValue(); | ||
| auto value = utils::string::replaceEnvironmentVariables(persisted_value); | ||
| bool need_to_persist_new_value = false; | ||
| fixValidatedProperty(std::string(prefix) + key, persisted_value, value, need_to_persist_new_value, *logger_); | ||
| dirty_ = dirty_ || need_to_persist_new_value; | ||
| properties_[key] = {persisted_value, value, need_to_persist_new_value}; | ||
| for (const auto& properties_file : properties_files_) { | ||
| std::ifstream file(properties_file, std::ifstream::in); | ||
| if (!file.good()) { | ||
| logger_->log_error("load configure file failed {}", properties_file); | ||
| continue; | ||
| } | ||
| for (const auto& line : PropertiesFile{file}) { | ||
| auto key = line.getKey(); | ||
| auto persisted_value = line.getValue(); | ||
| auto value = utils::string::replaceEnvironmentVariables(persisted_value); | ||
| bool need_to_persist_new_value = false; | ||
| fixValidatedProperty(std::string(prefix) + key, persisted_value, value, need_to_persist_new_value, *logger_); | ||
| dirty_ = dirty_ || need_to_persist_new_value; | ||
| properties_[key] = {persisted_value, value, need_to_persist_new_value}; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are 2 separate sections here, doing 2 separate things that can be encapsulated, I would extract them to 2 separate functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have refactored loadConfigureFile by pulling out two self-contained sections: edf88de
| const auto output_file = (persist_to_ == PersistTo::SingleFile ? base_properties_file_ : extra_properties_files_dir_name() / C2PropertiesFileName); | ||
| if (!std::filesystem::exists(output_file)) { | ||
| logger_->log_debug("Configuration file {} does not exist yet, creating it", output_file); | ||
| utils::file::create_dir(output_file.parent_path(), /* recursive = */ true); | ||
| std::ofstream file{output_file}; | ||
| } | ||
|
|
||
| std::ifstream file(output_file, std::ifstream::in); | ||
| if (!file) { | ||
| logger_->log_error("load configure file failed {}", properties_file_); | ||
| logger_->log_error("Failed to load configuration file {}", output_file); | ||
| return false; | ||
| } | ||
|
|
||
| auto new_file = properties_file_; | ||
| new_file += ".new"; | ||
|
|
||
| PropertiesFile current_content{file}; | ||
| for (const auto& prop : properties_) { | ||
| if (!prop.second.need_to_persist_new_value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would extract reading the PropertiesFile content to a separate function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulling out the file-reading part is not so easy because of the return false in line 219/257.
I have pulled out the next section, which sets the changed properties in current_content: 9d355e8
README.md
Outdated
| these are the default settings supplied by the latest MiNiFi version. If you would like to modify these, you should create a corresponding | ||
| .d directory (e.g. conf/minifi.properties.d) and put your settings in a new file inside this directory. These files are read and applied | ||
| in lexicographic order, after the default settings file. | ||
| The Windows installer creates a conf/minifi.properties.d/10_installer_properties file, which contains C2 connection settings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The files under minifi.properties.d should have the .properties extension too by convention, but it's OK if the minifi service doesn't enforce this and reads all files under the directory.
| The Windows installer creates a conf/minifi.properties.d/10_installer_properties file, which contains C2 connection settings. | |
| The Windows installer creates a conf/minifi.properties.d/10_installer.properties file, which contains C2 connection settings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know about convention: most of the files under /etc/*.d on my machine have no extension, with .conf a distant second and many others like .driver, .sources etc. I have changed them to .properties anyway, in 12e6d40.
When reading them, we will continue to read all files in the .d directory except backup files.
Currently, MiNiFi modifies the installed config files (minifi.properties and config.yml) while it runs. This confuses the Windows installer, so changes to these config files get lost when MiNiFi is upgraded. After this change, the minifi.properties, minifi-log.properties and minifi-uid.properties files are no longer modified by MiNiFi at runtime, so they can be safely replaced by new versions during upgrade. All changes to the settings should be put into new files in the minifi.properties.d (minifi-log.properties.d, minifi-uid.properties.d) directory; these new files will not be touched by the upgrade. The config.yml file will no longer be installed as part of MiNiFi. If C2 is enabled, config.yml will be fetched from the C2 server; otherwise, MiNiFi will create a new file with an empty flow, and the user can edit this. Either way, config.yml will not be touched by an upgrade.
... and use a for loop instead of while
12e6d40 to
5894882
Compare
|
|
||
| namespace { | ||
| void createDefaultFlowConfigFile(const std::filesystem::path& path) { | ||
| std::ofstream ostream(path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a bit of error reporting
| std::ofstream ostream(path); | |
| std::ofstream ostream(path); | |
| ostream.exceptions(std::ofstream::failbit | std::ofstream::badbit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
applied
| these are the default settings supplied by the latest MiNiFi version. If you would like to modify these, you should create a corresponding | ||
| .d directory (e.g. conf/minifi.properties.d) and put your settings in a new file inside this directory. These files are read and applied |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking it would be more user-friendly if we shipped the directory. One option could be to ship a symlink in it to minifi.properties like /etc/sysctl.d. Alternatively we should ship a README or an empty file just to keep the directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to ship all three directories: minifi.properties.d, minifi-log.properties.d and minifi-uid.properties.d, or only minifi.properties.d?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to think more about this, let me get back to this after the holidays. Now I'm thinking about a single minifi.properties.d with things like 10-general.properties, 10-c2.properties, 10-log.properties, 10-uid.properties, and they're symlinks to the old filenames under conf/, but I'm not sure yet this would be the right approach.
| #include "utils/Environment.h" | ||
|
|
||
| namespace { | ||
| bool fileContentsMatch(const std::filesystem::path& file_name, const std::unordered_set<std::string>& expected_contents) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(thread reply) I think move is valid: the moved-from state is messy in general, some people say invalid, the standard says unspecified but valid state. But if it's reassigned (by getfile), then it probably doesn't matter. I would be a proponent of a destructive move / relocate, which would prevent reuse like this, but we don't have it yet.
I think it's fine either way.
| utils::file::list_dir(extra_properties_files_dir, [&](const std::filesystem::path&, const std::filesystem::path& file_name) { | ||
| if (!file_name.string().ends_with(".bak")) { | ||
| extra_properties_file_names.push_back(file_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed my mind about the pattern matching: I think I'd rather have a whitelist by matching extensions, rather than a blacklist by *.bak. It's more explicit about what we want to read, and it leaves room for things like a README.txt like in /etc/sysctl.d on my system. By the way, another pattern I was thinking about following was /etc/X11/xorg.conf.d.
What do you think? As always, I'm open to different viewpoints and to being wrong.
| return *file_name_; | ||
| gsl_Expects(!file_locations_.empty()); | ||
| return file_locations_.front().filename(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this function still make sense? If it's using a list of files now, and exposes a single file name, then I'd rather change it to not expose a single file name instead of just taking the first one in the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still used when we serialize the checksum into the heartbeat. I agree that e.g. using a "checksum name" passed to the constructor would be nicer, but it would add a lot of extra code. Instead, I have renamed the function to getFileNameOfFirstFileLocation() in d06d435.
Co-authored-by: Márton Szász <[email protected]>
https://issues.apache.org/jira/browse/MINIFICPP-2342
Currently, MiNiFi modifies the installed config files (minifi.properties and config.yml) while it runs. This confuses the Windows installer, so changes to these config files get lost when MiNiFi is upgraded.
After this change, the minifi.properties, minifi-log.properties and minifi-uid.properties files are no longer modified by MiNiFi at runtime, so they can be safely replaced by new versions during upgrade. All changes to the settings should be put into new files in the minifi.properties.d (minifi-log.properties.d, minifi-uid.properties.d) directory; these new files will not be touched by the upgrade.
The config.yml file will no longer be installed as part of MiNiFi. If C2 is enabled, config.yml will be fetched from the C2 server; otherwise, MiNiFi will create a new file with an empty flow, and the user can edit this. Either way, config.yml will not be touched by an upgrade.
Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically main)?
Is your initial contribution a single, squashed commit?
For code changes:
For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.