The Resource

November 28, 2022

The November 2022 Edition of The Resource

Hello Reader, here is this month’s iRODS news and developments!

If you’re facing an issue with iRODS you’re not sure how to solve, please do drop me a line; if I’ve come across a solution or seen something relevant elsewhere, I’ll do my best to let you know. Or just drop me a mail to say ‘Hi’. Always nice to hear from people, particularly in these pandemic times!

I’d love your thoughts and feedback on how this newsletter could be better for you.

News

November has been a quiet month!

Not for me - I’ve been working through the maze of dependencies, upgrading 200+ systems from 4.2.7 to 4.2.11. Not there yet - my dev systems are updated, but still have some issues. Hopefully I’ll be able to report successful production upgrades in the December newsletter. What have you been working on?

I’ve joined Mastodon, as many are at the moment. Feel free to connect with me at @kript@mastodon.theultraworld.org. Not all my posts are iRODS related, though!

Main Repository Activity

Open Issues

​Incorrect use of gethostname() and HOST_NAME_MAX​

iRODS misuses POSIX's gethostname() by passing a buffer of size HOST_NAME_MAX.

​irods::server_properties::map() can result in a data race​

Use of the map() member function is very convenient as it grants access to the underlying nlohmann::json object directly. However, it can result in data races because it bypasses the synchronization mechanisms used internally by the irods::server_properties instance.
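The pattern is easy to reproduce in any language. Here is a minimal Python sketch of it, not the iRODS C++ code and not the snippet from the issue: `ServerProperties`, `snapshot()` and the keys used are hypothetical names chosen for illustration.

```python
import copy
import threading


class ServerProperties:
    """Hypothetical stand-in for a lock-protected configuration store."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._data = dict(initial)

    def get(self, key):
        # Synchronized read: the value is copied out while the lock is held.
        with self._lock:
            return copy.deepcopy(self._data[key])

    def set(self, key, value):
        # Synchronized write.
        with self._lock:
            self._data[key] = value

    def map(self):
        # Escape hatch: returns the live dict, so callers read and write it
        # without ever taking the lock. Convenient, but unsynchronized.
        return self._data

    def snapshot(self):
        # One possible alternative: a deep copy taken under the lock.
        with self._lock:
            return copy.deepcopy(self._data)


props = ServerProperties({"log_level": {"server": "info"}})
live = props.map()        # aliases the internal state
snap = props.snapshot()   # independent copy
props.set("zone_name", "tempZone")
```

After the `set()` call, `live` reflects the change because it is the same object other threads are mutating, while `snap` does not. Any thread holding `live` can therefore observe or cause a half-finished update.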

​LDAP Integration Feature request​

Would you like iRODS to integrate with LDAP? What should it do when it loses connectivity to its LDAP system? Join the discussion above and let the maintainers know, especially if this is something you would find beneficial!

​irods-grid is sensitive to ordering of entries in /etc/hosts​

FQDN goes first, which is a convention, but it really shouldn’t break things.

​Document which configuration properties can be changed post setup​

Not all config properties are allowed to be changed post setup. Therefore, docs.irods.org needs to list which properties are safe to change post setup.

I think this refers to the unattended installation setting, where you pass the setup script a JSON file. There is nothing stopping you doing this multiple times; however, the changes are not always idempotent. See the next issue!

​Document unattended installation’s overwrite behavior​

Unattended installs will completely overwrite the contents of server_config.json and irods_environment.json. This can lead to a nonfunctional server if information shared between the config files and the database becomes out of sync.
docs.irods.org needs to include a few statements about this behavior.
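
One way to guard against surprises before re-running an unattended install is to compare the existing configuration with the one about to be written. The sketch below is an illustration only and not part of the iRODS setup script: `overwrite_report` is a hypothetical helper, and the keys and values are placeholders.

```python
def overwrite_report(existing, incoming):
    """Summarize what a full overwrite of `existing` by `incoming` would do."""
    existing_keys = set(existing)
    incoming_keys = set(incoming)
    return {
        # Keys present today that the new file does not carry at all.
        "lost": sorted(existing_keys - incoming_keys),
        # Keys present in both files whose values differ.
        "changed": sorted(
            key
            for key in existing_keys & incoming_keys
            if existing[key] != incoming[key]
        ),
    }


# Placeholder data; in practice both dicts would come from json.load().
current = {"zone_key": "OLD_ZONE_KEY", "negotiation_key": "SAME_KEY", "site_setting": 1}
unattended = {"zone_key": "NEW_ZONE_KEY", "negotiation_key": "SAME_KEY"}
report = overwrite_report(current, unattended)
```

Here `report` lists `site_setting` as lost and `zone_key` as changed, which is the kind of difference worth checking against the catalog before the files are replaced.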

​Rename base64_encode and base64_decode​

I have found when testing the Globus plugin (client to iRODS) that we appear to be linking to a version of base64_encode and base64_decode that is not the intended version. These function names are too generic and should be placed into a namespace or something.

​Have resource server use its default_file_mode configuration value when creating local replicas​

Enhancement request;

One use case of iRODS is to colocate a data consuming service on the resource server hosting the data for this service. The service needs file system level permission to access the data, which can be configured using default_file_mode. Currently, iRODS chooses the value of default_file_mode set on the iRODS server the uploading client connects to, i.e., the client's irods_host configuration value. If the client connects to the zone's canonical iRODS host, which may be a load balancer, it is likely that the client won't connect to the colocated resource server, and this server's default_file_mode won't be chosen. To prevent this from happening, the default_file_mode on all of the iRODS servers needs to be set to the value required on the colocated one. This isn't intuitive, and it's not always desirable.
Could iRODS be changed so that the selected default_file_mode comes from the resource server hosting the storage resource chosen for a new replica? Or maybe the unixfilesystem resource could be modified to accept file mode as a context value.

​msiSetDefaultResc / acSetRescSchemeForCreate no longer force Resource write when incorrect resource given​

This is working as designed.
This bugfix was part of #4084 for 4.2.9.
Diff for the docs: irods/irods_docs@d2631f0
Note the difference in the last row of the tables in 4.2.7:
https://docs.irods.org/4.2.7/system_overview/configuration/#default-resource-configuration
vs 4.2.11
https://docs.irods.org/4.2.11/system_overview/configuration/#default-resource-configuration

​irods::client_api_allowlist::enforce is marked noexcept, but can throw exceptions​

​Document when the server requires a restart in regard to SSL configuration changes​

For the avoidance of doubt, restart the server whenever you change the SSL configuration, until a more canonical answer emerges. I can verify that iRODS will continue to use the old configuration unless you do, at least for some of the processes - enough to basically stop it working. This is puzzling because stracing the server when it starts a new rodsAgent process shows it reading the cert file.

​test_ifsck__2650 test failure​

Failure between 4.3.0 and 4.3.1, so current 4-3-stable

​Refactor user administration API to throw exceptions instead of return error codes​

​Re-enable test_auth​

​Add remove_if_exists(file) and make_arbitrary_file to lib.py​

These are in the s3 plugin (s3plugin_lib.py). They have been requested to be added to lib.py.

​irodsDelayServer does not start because FQDN is configured as delay_server leader​

This seems to be caused by the fact that the initial delay_server leader is set to the FQDN of my iRODS host, while the code uses the (short) hostname for comparison.
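
The mismatch is easy to picture with a small sketch. This is not the iRODS code or its eventual fix, only an illustration of a comparison that tolerates one side being a short hostname; `is_leader` and the host names are hypothetical.

```python
def short_name(host):
    """Return the first label of a hostname, lower-cased."""
    return host.split(".", 1)[0].lower()


def is_leader(configured_leader, local_hostname):
    """Compare a configured leader name with the local hostname.

    Exact (case-insensitive) matches always count. If exactly one side is a
    short name, fall back to comparing first labels. Two different FQDNs
    never match, even if their first labels agree.
    """
    configured = configured_leader.rstrip(".").lower()
    local = local_hostname.rstrip(".").lower()
    if configured == local:
        return True
    if "." not in configured or "." not in local:
        return short_name(configured) == short_name(local)
    return False
```

A plain string comparison of `irods1.example.org` against `irods1` fails, which matches the behaviour described in the report; the helper above treats them as the same host while still rejecting `irods1.other.org`.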

​Error when calling msiDataObjChksum from acPostProcForFilePathReg​

Initial report;

We had this configuration in iRODS 4.2.8, and we never got this error. I think because acPostProcForFilePathReg was not triggered when an object was put. This seems to have changed somewhere between 4.2.8 and 4.3.0.
We used this rule to create a checksum for objects that are registered using ireg.

and response;

Yes, in 4.2.9, with the work done to unify many of the codepaths and provide logical locking - the registration for both 'large' AND 'small' files now happens prior to data being written to disk - as a placeholder for the locking to have a thing to hold.
In 4.2.8 and before, registration-before-data-on-disk would have only happened for 'large' files that triggered parallel transfer (default, >32MB).
Please try pep_api_phy_path_reg_post() instead... I believe this will not fire for an iput, but will fire for ireg.

​cross-zone connections between irods servers in non-federated setting​

For easy reproduction, the following should trigger the bug:
Run an irods server version 4.2.11
Run iinit from another host with icommands version 4.2.10 or 4.2.11 and a readable /etc/irods/server_config.json containing only:
{
    "negotiation_key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "zone_key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
}

​Document mapping between client-side APIs and dynamic PEPs​

​Allow connection_pool and client_connection to refresh connections after N API requests​

#6593 means we need to consider a few things regarding long running agents:
How do we handle memory leaks?
How do we handle admins modifying policy, primarily for the iRODS rule language?
One way to get around this is to introduce a counter that is associated with each connection. This counter will represent the number of requests processed by the agent. Once the counter exceeds N requests, the connection is replaced by a fresh connection.
Replacing the connection enables the following:
The agent servicing the requests is shut down
Memory leaked by the agent is returned to the OS
The new agent sees the updated policy
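
The counter idea can be sketched in a few lines of Python. This is an illustration of the proposal, not the connection_pool or client_connection implementation: `RefreshingConnection`, `FakeConnection` and `max_requests` are hypothetical names, and the fake connection only records which instance served each request.

```python
class RefreshingConnection:
    """Wrap a connection factory and replace the connection every N requests."""

    def __init__(self, connect, max_requests):
        if max_requests < 1:
            raise ValueError("max_requests must be at least 1")
        self._connect = connect
        self._max_requests = max_requests
        self._conn = connect()
        self._count = 0

    def request(self, payload):
        # Once the current connection has served N requests, close it (which
        # would shut down the agent behind it) and open a fresh one.
        if self._count >= self._max_requests:
            self._conn.close()
            self._conn = self._connect()
            self._count = 0
        self._count += 1
        return self._conn.send(payload)


class FakeConnection:
    """Stand-in for a real connection; tags replies with its own id."""

    created = 0

    def __init__(self):
        FakeConnection.created += 1
        self.ident = FakeConnection.created
        self.closed = False

    def send(self, payload):
        return (self.ident, payload)

    def close(self):
        self.closed = True


conn = RefreshingConnection(FakeConnection, max_requests=2)
served_by = [conn.request(n)[0] for n in range(5)]
```

With `max_requests=2`, five requests are served by three successive connections, so no single agent lives long enough to accumulate much leaked memory or to keep running stale policy.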

​iadmin mkzone should report an error when given invalid connection information​

​rstrcpy needs to log source string on error​

​Remove responsibility of freeing heap allocated memory from packstruct on API response​

​msiExecCmd_bin directory is owned by root:root after force-reinstalling packages via rpm​

​Deadlock in MySQL database plugin on many concurrent inserts​

The MySQL function R_ObjectId_nextval() is not safe for parallel database updates, which some of our users can trigger by opening a large number of parallel connections.

​iunreg should instruct unlink API to skip vault check​

This message was added to signal that an attempt was made to unlink a file not found in an iRODS vault, which could lead to data loss. However, iunreg is not attempting to unlink the file, so the log message is superfluous. A keyword called RESOURCE_SKIP_VAULT_PATH_CHECK_ON_UNLINK was provided to instruct the API to skip the vault check, and iunreg should provide this in its call to rcDataObjUnlink.

​document iquest attrs​

​Improper delay execution frequency time is Ignored or Misinterpreted​

Worth noting if you use delay rules with the execution frequency (EF) option.

​Update documentation for ibun/msiTarFileExtract()​

Closed Issues

Closed on - 2022-11-11 16:23:18 icommands should compile against the same C++ standard as the server​

Closed on - 2022-11-04 20:32:40 GitHub actions for clang-format and clang-tidy no longer work due to Ubuntu 22.04​

Closed on - 2022-11-04 20:32:30 User administration C++ library cannot query info about remote users​

Closed on - 2022-11-07 20:42:29 Remove rule_texts_for_tests.py​

Closed on - 2022-10-26 22:57:30 Administration libraries do not pass down include dirs properly​

Closed on - 2022-11-07 19:41:55 Document maximum_size_of_delay_queue_in_bytes​

​Documentation change;

maximum_size_of_delay_queue_in_bytes (optional) (default 0) - The maximum number of bytes available to the delay queue. When set to 0, the delay server will use as much memory as it needs to hold queued rules.

Closed on - 2022-11-04 20:32:19 Don’t allow clang-format to format api_plugin_number_data.h​

Closed on - 2022-10-25 21:02:51 Clang-Tidy GitHub workflow cannot find catch2 headers​

Closed on - 2022-10-25 21:02:38 Memory leak in PackStruct unit test​

Closed on - 2022-10-19 19:13:47 Clang-Format: Disable Preprocessor formatting​

Closed on - 2022-11-09 21:42:08 Remove log_facility property from log message output​

Closed on - 2022-11-09 21:42:16 Add zone name to log message output​

Closed on - 2022-11-08 19:29:57 Refactor client_api_allowlist interface to match style of replica_access_table, etc​

Closed on - 2022-11-04 16:13:09 Deprecate SimpleQuery​

Closed on - 2022-10-20 21:10:43 iadmin fails to list user in case of particular username length combinations​

Targeted at 4.2.12

I have confirmed that replacing the SimpleQuery implementation with a GenQuery implementation resolves this issue. I am going to try to replace the other SimpleQuery uses in iadmin as part of this effort so that we can eventually remove it as this seems to be the last remaining holdout.

Closed on - 2022-11-07 20:45:23 closeAllL1desc should not call PEPs​

Closed on - 2022-11-09 21:41:44 Refactor resource administration API to throw exceptions instead of return error codes​

Closed on - 2022-10-25 21:02:15 Add C++ library for managing zones​

Targeted at 4.3.1

The library should be modeled after the user/group administration library with the goal of providing a modern interface and simplifying usage of the zone management features provided by the iRODS C API function, rxGeneralAdmin.

Closed on - 2022-10-25 21:02:02 Expose utility functions used by the User Administration library​

Closed on - 2022-11-04 20:33:25 Allow identity of user attached to connection/agent to change in real time​

iRODS connections tie the identity of a user to the socket. This is fine for one-off commands, but not for situations where there can be hundreds to thousands of concurrent users. Creating a new connection for every user will quickly drain resources.
Therefore, iRODS should provide a way to change the user identity tied to the connection object. This would lead to huge improvements regarding performance, scalability, and resource management.
This would also improve support for client applications because the client libraries would finally be able to implement real connection pooling for iRODS connections.

Closed on - 2022-10-27 00:21:34 JSON Schema validation paths are incorrect for non-package installs​

Closed on - 2022-10-25 21:02:25 Consider adding feature test macros​

Closed on - 2022-10-20 15:01:47 ichmod should not be allowed to bypass the permission model​

ichmod is currently allowed to bypass the permission model when the user adjusting the permissions matches the original owner.
Only users with own permissions or an admin should be allowed to restore alice's permissions.

Closed on - 2022-11-04 19:17:26 Refactor / Modernize main server logic (rodsServer.cpp, etc.)​

Closed on - 2022-11-09 21:15:03 Delay server should not log a stack trace when default config value is used​

Closed on - 2022-10-21 20:38:10 Delay server adds completed rules back to queue, race condition, then complains loudly​

Closed on - 2022-11-08 21:14:33 server-side irods_environment.json doc and validation schema bugs​

Closed on - 2022-11-07 19:41:45 Client API Allowlist option does not align with description at docs.irods.org​

Closed on - 2022-11-09 21:15:27 add delay server memory usage default to server_config.json on upgrade​

Closed on - 2022-11-08 04:03:17 Non-admins should not be allowed to run iadmin lg​

Closed on - 2022-11-07 15:55:42 Atomic metadata update api lookup fails​

Closed on - 2022-11-04 20:03:24 provide client connection information to acPreConnect​

User requested to be able to determine whether to use SSL or not based on the incoming network connection (internal network or external). Original thread at https://groups.google.com/g/irod-chat/c/3afhUiB2A0k.
This would be done within acPreConnect(), however, there is no connection information available to that PEP to make an internal/external determination. It is handed a manufactured, empty rei.
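
The internal/external decision itself is straightforward once the client address is available. The Python sketch below illustrates only that decision, outside of any rule engine: the network ranges and `negotiation_policy` are hypothetical, and the returned strings are the client-server negotiation values as I understand them from the default acPreConnect rule, so treat them as an assumption.

```python
import ipaddress

# Hypothetical list of networks considered "internal" for this site.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def negotiation_policy(client_addr):
    """Pick a negotiation policy based on where the client connects from."""
    addr = ipaddress.ip_address(client_addr)
    # Membership tests across IP versions simply evaluate to False.
    internal = any(addr in net for net in INTERNAL_NETWORKS)
    # Assumed policy: leave internal traffic alone, require SSL from outside.
    return "CS_NEG_DONT_CARE" if internal else "CS_NEG_REQUIRE"
```

For example, `negotiation_policy("10.1.2.3")` yields the relaxed policy while a public address such as `8.8.8.8` yields the strict one.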

Closed on - 2022-10-24 14:54:57 Fix delay hints parser for “DOUBLE” directives​

Closed on - 2022-10-24 15:32:56 Add detached mode to unixfilesystem plugin​

Very interested in this one;

Add detached mode to UFS.
This will be similar to what is done for the cacheless S3 plugin. Any resource server can serve up the request.

Python iRODS Client Activity

Open Issues

​Expose errno code in string representation of an irods.exception​

It was noted in this previous issue and comment that more information could be given in the product of repr(e), with e being an irods.exception returned to the PRC client from the iRODS server. Specifically, the errno code provided by the OS and propagated back to the client (in that case e.code was 28 -> ENOSPC -> 'No space left on device') is essential information which rightfully should be expected as part of the repr(e) output.
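
To show the kind of output being asked for, here is a small standalone sketch. It is not the PRC exception hierarchy: `ServerError` and its `os_errno` attribute are hypothetical, and only the standard library's errno and os modules are used to translate the code.

```python
import errno
import os


class ServerError(Exception):
    """Hypothetical exception that carries an OS-level errno, if one is known."""

    def __init__(self, message, os_errno=None):
        super().__init__(message)
        self.os_errno = os_errno

    def __repr__(self):
        base = f"{type(self).__name__}({self.args[0]!r})"
        if self.os_errno is None:
            return base
        # Translate the number into its symbolic name and a readable message.
        name = errno.errorcode.get(self.os_errno, "UNKNOWN")
        text = os.strerror(self.os_errno)
        return f"<{base} errno={self.os_errno} {name}: {text}>"


e = ServerError("write failed", os_errno=errno.ENOSPC)
shown = repr(e)
```

With the errno attached, `repr(e)` names ENOSPC and includes the operating system's description of it, rather than leaving the reader to look the number up.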

Closed Issues

Closed on - 2022-11-13 16:04:45 Open socket connections can still cause log noise when gc-collected in Python​

Closed on - 2022-11-13 16:05:11 Generate SSL context from iRODS settings ​

To make SSL connections follow the configured irods_ssl_* settings more naturally, allow the SSLContext to be generated automatically by the client library instead of relying on the user to provide a default-generated context.
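
As a rough idea of what that involves, the sketch below builds an ssl.SSLContext from a dictionary of settings. It is not the PRC implementation: `make_ssl_context` is a hypothetical helper, it handles only two settings, and the default of "hostname" for irods_ssl_verify_server is an assumption.

```python
import ssl


def make_ssl_context(env):
    """Build an SSLContext from a dict of irods_ssl_* settings (sketch only)."""
    # With cafile=None, the system's default CA certificates are loaded.
    ctx = ssl.create_default_context(
        ssl.Purpose.SERVER_AUTH,
        cafile=env.get("irods_ssl_ca_certificate_file"),
    )
    verify = env.get("irods_ssl_verify_server", "hostname")  # assumed default
    if verify == "none":
        # check_hostname must be disabled before verification is turned off.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    elif verify == "cert":
        # Verify the certificate chain but not the host name.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


strict = make_ssl_context({})
relaxed = make_ssl_context({"irods_ssl_verify_server": "none"})
```

The point is that the user never has to construct the context by hand; the same settings file that configures the icommands can drive the client library.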

Closed on - 2022-11-10 05:21:23 Fix password_obfuscation in Windows​

The Python os library module in Windows does not implement the getuid() function. As a result, attempting to encode()/decode() a password with the password_obfuscation module results in an AttributeError. The getlogin() function, however, is implemented for Windows. Provided the login can be assumed to be as unique as a uid, this can be used as replacement salt for the process.
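
The fallback described above can be sketched as follows. This is an illustration rather than the actual password_obfuscation change: `uid_or_login` is a hypothetical helper, it returns a string in both cases for simplicity, and the two SimpleNamespace objects only imitate what the os module offers on each platform.

```python
import os
from types import SimpleNamespace


def uid_or_login(os_module=os):
    """Return the uid where the platform has one, else the login name."""
    getuid = getattr(os_module, "getuid", None)
    if getuid is not None:
        return str(getuid())
    # os.getuid() does not exist on Windows, so fall back to the login name.
    return os_module.getlogin()


# Imitations of the os module on each kind of platform, for demonstration.
posix_like = SimpleNamespace(getuid=lambda: 1000, getlogin=lambda: "alice")
windows_like = SimpleNamespace(getlogin=lambda: "alice")
```

Looking the function up with getattr avoids the AttributeError, and the caveat from the issue still applies: the login name is only a good substitute if it is as unique as a uid on that machine.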

Closed on - 2022-10-18 17:41:59 Rule execution with a file with null input throws an error -1201000​

Closed on - 2022-10-22 17:23:28 Large put() over federation leaves “valid” replicas of incorrect size and checksum when interrupted​

The action is to move to 4.2.12 / 4.3.0! There are notes about various iRODS versions in the README already and it would be helpful to have an additional note to the effect that PRC put != iput and the consequences of how that interacts with different iRODS versions.

Closed on - 2022-11-12 13:59:27 irods.exception.SYS_NO_API_PRIV when groupadmin creates group​

Closed on - 2022-10-18 15:08:34 Rule execution should allow the ‘null’ input parameter for the rule_file (.r)​

YODA Activity

Open Issues

​[FEATURE] statistics overview split between research, vault, and total used.​

The statistics module shows total used storage by category and group. But no distinction is possible between research area, vault area, and total used storage.

​[FEATURE] support for subdomain in ansible parameter external_users_domain_filter and oidc_domains​

The configuration for OIDC domains works only for the root domain, for example surf.nl.
If the organization has users with emails with subdomains, they are not matched by the current Yoda rules.
For example, mydepartment.surf.nl is not matched if the oidc_domains list contains surf.nl.
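
The requested behaviour amounts to a suffix match on whole domain labels. Here is a minimal Python sketch of that matching, not the Yoda ruleset code; `domain_allowed` is a hypothetical helper.

```python
def domain_allowed(email, allowed_domains):
    """True if the email's domain equals, or is a subdomain of, an allowed one."""
    if "@" not in email:
        return False
    domain = email.rsplit("@", 1)[1].lower().rstrip(".")
    for allowed in allowed_domains:
        allowed = allowed.lower().strip(".")
        # Match on a label boundary so "notsurf.nl" does not match "surf.nl".
        if domain == allowed or domain.endswith("." + allowed):
            return True
    return False
```

Requiring the leading dot in the suffix check is what keeps look-alike domains out while letting mydepartment.surf.nl through.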

​[BUG] changing the subcategory in group properties for a datamanager group does not work​

Closed Issues

Closed on - 2022-10-28 10:31:35 [FEATURE] Data Access Password expiration notification​

When a Data Access Password has expired a WebDAV client will throw an authentication error, this might lead to unneeded support calls when researchers forget they have to set a new password themselves.
Describe the solution you'd like
Send an automatic notification to the user X hours (configurable for the instance) before a Data Access Password expires. The existing notification functionality in 1.8 seems suitable for this.
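
The timing check behind such a notification is small. The sketch below is an illustration only, not the Yoda implementation: `notification_due` and `hours_before` are hypothetical names standing in for the per-instance setting mentioned in the request.

```python
from datetime import datetime, timedelta


def notification_due(expires_at, now, hours_before):
    """True once `now` falls inside the warning window before expiry."""
    window_start = expires_at - timedelta(hours=hours_before)
    # No reminder before the window opens, and none once the password expired.
    return window_start <= now < expires_at


expiry = datetime(2022, 12, 1, 12, 0)
in_window = notification_due(expiry, datetime(2022, 11, 30, 13, 0), 24)
too_early = notification_due(expiry, datetime(2022, 11, 29, 12, 0), 24)
```

A real job would also need to remember which users have already been notified, so that each expiry triggers one message rather than one per run.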

Will be released with v1.9.0.

Closed on - 2022-10-25 10:05:14 [FEATURE] DOI versioning​

DOI versioning - quite a detailed request.

Support for DOI versions is added in UtrechtUniversity/yoda-ruleset@2b15ff9

Will be released with v1.9.0.

If you think someone else would appreciate this newsletter, they can sign up at https://theresource.metadata.school/

No Yaks were shaved in the making of this newsletter. It had to happen some time…

​
