Merged
Conversation
Optional arguments always/never used have a peculiar semantic. If a function is only used internally, then the optional arguments reports only contain references to the implementation (`.ml`). If it is only used externally, and there is an interface (`.mli`), then they only contain references to the interface (`.mli`). However, when it is used both internally and externally, then they contain references to both, depending on the internal and external uses of the optional arguments. This is why the optional arguments reports may appear as "duplicated" : the same optional argument is referenced as always/never used both in the `.ml` and the `.mli`. The expected reports for optional arguments in `preprocessed_lib/preprocessed.ml` were therefore invalid because they were "unified", as if internal and external uses all counted as internal uses. By following the described semantics, the `?internal` (resp. `?external`) optional argument is now expected to be reported as always used for the `.ml` (resp. `.mli`). Similarly for the `?external` (resp. `?internal`) which is expected to be reported as never used in the `.ml` (resp. `.mli`). Because of this dissociation, the `?external` and `?internal` arguments are not expected to be reported as almost always/never used anymore in the threshold-3-0.5 test scenario.
The new tests are witnesses of the non-dissociation of the internal and external uses of the optional arguments. In particular, the optional arguments of the `Preprocessd.externally_used_f` function should only be reported with locations in `preprocessed_lib/preprocessed.mli` but are currently wrongly reported in `preprocessed_lib/preprocessed.ml`, as if there was no `.mli` like in the case of `Preprocessed_no_intf`.
Rather than removing the `.pp` extension in `Utils.unit`, it is directly removed from the sourcepaths read in `.cmt` files. This way, compilation units remain consistent; and file paths read in locations are now also consistent with the sourcepaths. More details on why this fixes the observed FN and FP below. With this change, I measured a gain of ~10s (from ~22s to ~12s) when running the analyzer with `-v --all` on [Frama-C 31.0](https://git.frama-c.com/pub/frama-c/-/tree/31.0?ref_type=tags) on my machine. This is coherent with the few computations of the cmts' sourcepaths (once per `.cmt` read) vs the amount of time the compilation unit is computed (multiple times on every potentially meaningful location). As mentionned in f090928, there was a difference between the compilation unit of a preprocessed file and the one of an unpreprocessed file. Forcing them to have the same compilation unit by removing the `.pp` extension is sufficient in most cases. However, in the case of optional arguments, some computations relied on finding whether a location belongs to the current or a previously analyzed sourcefile, or if it belongs to an unknown (potentially analyzed later) sourcefile. In particular, `VdNode.eof` relies on this to discard "internal" locations and, thus, dissociate them from their corresponding "external" locations. This property is checked in `DeadCommon.VdNode.seen` by verifying if the location has a corresponding sourcepath in the `DeadCommon.abspath` table. This table associates compilation units (as obtained by `Utils.unit`) with sourcepaths (read in the `.cmi` and `.cmt` files) and is filled when starting the analysis of either a `.cmi` or a `.cmt`. Verifying that a location belongs to a known file is done by checking if its filename is a suffix of one of the sourcepaths associated with its compilation unit. The sourcepath of a `.cmt` file may end with `.pp.ml`, while locaitons found in the typedtree will not contain the `.pp` part. Therefore, the inconsistency observed in f090928 still remains for sourcepaths. Because of this inconsistency, the internal and external locations are not dissociated at the end of the analysis of a `.cmt`, leading to the observed FN and FP.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix #47
The
.expfiles are updated to reflect the expected dissociation and new tests are added to witness details of the optional arguments reporting semantics (see #47).The invalid results were due to an inconsistency between sourcepaths built from
cmt_infos.cmt_sourcefileand filenames found in the typedtrees' locations. The former could contain a.ppextension, unlike the latter. The.ppwas already removed from compilation units for consistency (f090928).Because of this inconsistency,
DeadCommon.VdNode.eofwould not be able to discard opt args "internal" declarations and, thus, dissociate them from their corresponding "external" declarations. Therefore, the internal and external uses of a value were all linked to the internal declaration, and the reports reflected this.Removing
.ppdirectly from the sourcepath rather than on compilation units fixes the inconsistency and ensures that it is mechanically absent from compilation units.This change improves the performance on Frama-C 31.0. I measured a gain of ~10s (from ~22s to ~12s) when running the analyzer with
-v --allon my machine, without any noticeable impact on memory consumption.The change in results only affects the optional arguments reports. I reviewed 25% of it in-depth and another 25% quickly to confirm the update works as expected and only brings fixes.