Internet-Draft Recursively Setting Attributes October 2024
Zhang, et al. Expires 17 April 2025
Workgroup: Network File System Version 4
Internet-Draft: draft-mzhang-nfsv4-recursively-setting-06
Published: October 2024
Intended Status: Standards Track
Expires: 17 April 2025
Authors:
M. Zhang
Huawei Technologies
S. Bhargo
Broadcom Inc.
R. Parambattu
Huawei Technologies
D. Geng
Huawei Technologies
Y. Du
Huawei Technologies

Recursively Setting Attributes of Subdirectories and Files

Abstract

In recent years, near-data computing has become widely recognized in storage architectures. The core idea is to process data close to where it is stored, which reduces network transmission overhead and uses the computing capability of smart devices (such as intelligent NICs, smart SSDs, and DPUs). This reduces CPU and memory usage on clients (computing nodes) and improves data processing efficiency. NFSv4.2 already applies this idea in features such as Server-Side Copy, in which the client sends a control command and the storage server copies the data without transferring it between client and server, and future NFS versions may apply it further. This document proposes a new mechanism, based on the same thinking as Server-Side Copy, for setting the attributes of all files and directories under a parent directory. Compared with the traditional way of setting attributes, it reduces the data transmitted over the network and frees bandwidth.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 17 April 2025.

1. Problem Statement

In real storage deployments, users often set attributes recursively on a directory and its sub-items (its files and subdirectories). The message exchange between client and server is complex and the client consumes significant resources, which does not match the concept of near-data computing. Figure 1 shows the existing sequence for recursively setting the attributes of all files under a directory.

Step 1: The client sends the READDIR command to obtain the list of all files in dir1.

Step 2: The storage server responds to the READDIR operation. If the directory contains many subdirectories and files, the client needs to issue the READDIR operation multiple times.

Step 3: The client sends a SETATTR request for each subdirectory and file.

Step 4: The storage server responds to the SETATTR request.

If the parent directory contains 100,000 files, the client needs to repeat steps 3 and 4 100,000 times. The whole process consumes client CPU and memory, and a large number of RPC messages are exchanged between the client and the storage server. As a result, the end-to-end time for the attribute set operation is relatively long.
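
The cost of the existing flow can be estimated with simple arithmetic. The following Python sketch counts round trips for one flat directory. It assumes one operation per RPC and a hypothetical READDIR reply size of 1,000 entries; neither figure comes from this document, and the function name is illustrative only.

```python
import math

def iterative_rpc_count(num_entries: int, entries_per_readdir: int) -> int:
    """Round trips for READDIR plus one SETATTR per entry in a flat directory."""
    # Assumption: each READDIR reply carries at most entries_per_readdir entries.
    readdir_calls = max(1, math.ceil(num_entries / entries_per_readdir))
    # Steps 3 and 4: one SETATTR request/response pair per file or subdirectory.
    setattr_calls = num_entries
    return readdir_calls + setattr_calls

# 100 READDIR calls + 100,000 SETATTR calls
print(iterative_rpc_count(100_000, 1_000))  # prints 100100
```

Under these assumptions the SETATTR exchanges dominate, which is the part a single server-side request would remove.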

                 Client                                Server
                 +                                       +
                 |                                       |
                 |------ READDIR ----------------------->|
                 |<--------------------------------------|
                 |------ GETATTR ----------------------->|
                 |<--------------------------------------|
                 |------ SETATTR ----------------------->|
                 |<--------------------------------------|
                 |         ....                          |
                 |                                       |

       Figure 1: Existing flowchart for recursive set operation

Similar to the design of Server-Side Copy, this document proposes four new operations for recursively setting the attributes of a directory and its subdirectories and files. These operations can be used in synchronous or asynchronous mode. The four new operations are RECURSIVE_SET, RECURSIVE_SET_STATUS, RECURSIVE_SET_CANCEL, and CB_RECURSIVE_SET_NOTIFY.

RECURSIVE_SET is used by the client to request that the attributes of directories and files be set.

RECURSIVE_SET_STATUS is used by the client to query the status of the recursive set operation requested by RECURSIVE_SET.

RECURSIVE_SET_CANCEL is used by the client to cancel the recursive set operation.

CB_RECURSIVE_SET_NOTIFY is used by the server to notify the client that the recursive set operation has finished. This operation is used only in asynchronous mode.

2. Protocol Overview

In the proposed mechanism, the client can use RECURSIVE_SET in synchronous or asynchronous mode. For synchronous mode the server applies an initial timeout, which is recommended to be 1/3 of the lease time.

By applying the concept of near-data computing, the scenario above can be optimized.

Step 1: The client identifies that the target of the attribute setting is a directory and that the setting is recursive, and invokes the new operation RECURSIVE_SET in a compound request, e.g.

Compound request:

SEQUENCE

PUTFH (directory filehandle)

RECURSIVE_SET

SETATTR

RECURSIVE_SET_STATUS

Step 2: The storage server receives the compound request with the RECURSIVE_SET operation before SETATTR. If the current filehandle refers to a directory, the server creates a recursive set task, recursively enumerates all files and subdirectories in the directory, and sets the attributes of each one. If the filehandle refers to a regular file, the server SHOULD return NFS4ERR_NOTDIR.
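
The server-side task in Step 2 can be sketched as a tree walk. The following Python example is only an illustration under stated assumptions: a local file system stands in for the server's backing store, the only attribute set is the mode, status codes are represented as strings, and recursive_set_mode is a hypothetical helper that this document does not define.

```python
import os

NFS4_OK = "NFS4_OK"
NFS4ERR_NOTDIR = "NFS4ERR_NOTDIR"

def recursive_set_mode(root: str, mode: int) -> str:
    """Apply `mode` to root and everything beneath it (hypothetical helper)."""
    if not os.path.isdir(root):
        # Mirrors the rule that a non-directory filehandle is rejected.
        return NFS4ERR_NOTDIR
    # Bottom-up walk: a directory's entries are changed only after the
    # subtrees below that directory have already been traversed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            os.chmod(os.path.join(dirpath, name), mode)
    os.chmod(root, mode)
    return NFS4_OK
```

A real server would operate on filehandles and the full SETATTR attribute set rather than on paths and a mode value.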

Step 3: The storage server responds to the request once the recursive set operation has set the attributes of all subdirectories and files. RECURSIVE_SET is either synchronous or asynchronous.

If the client chooses synchronous RECURSIVE_SET, the server must respond to the client once it finishes the operation. If the server fails to complete the attribute set within the timeout, it responds with the error code NFS4ERR_PENDING together with a recursive task id and verifier. The client then queries the result periodically until the operation is completed on the server side.

If the client chooses asynchronous mode, the server immediately returns the error code NFS4ERR_PENDING for the RECURSIVE_SET_STATUS operation, with a recursive task id and verifier, and the client starts an observer task to monitor the server. The server sends the callback operation CB_RECURSIVE_SET_NOTIFY to the client once it finishes the RECURSIVE_SET operation. The client terminates the observer task once it receives the callback notification from the server.

Compared to the original iterative process, the proposed process not only saves CPU and memory on the client, but also significantly reduces the number of RPCs exchanged between the client and server. This greatly improves the performance of setting attributes on subdirectories and files.

o If no backchannel is created when the client and server establish a connection, the client can only use the synchronous mode in the RECURSIVE_SET request. If the client uses the asynchronous mode, the server returns the error code NFS4ERR_CB_PATH_DOWN.

o If a backchannel is already established, the client can choose to use synchronous or asynchronous mode.
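
The two rules above amount to a small decision on each side. The Python sketch below shows one way to express them; both function names are hypothetical and the status code is represented as a string for illustration.

```python
from typing import Optional

NFS4ERR_CB_PATH_DOWN = "NFS4ERR_CB_PATH_DOWN"

def choose_rsa_sync(has_backchannel: bool, prefer_async: bool) -> bool:
    """Client side: pick the rsa_sync value to send in RECURSIVE_SET."""
    if not has_backchannel:
        return True  # without a backchannel only synchronous mode is usable
    return not prefer_async

def check_requested_mode(rsa_sync: bool, has_backchannel: bool) -> Optional[str]:
    """Server side: return an error for an asynchronous request with no backchannel."""
    if not rsa_sync and not has_backchannel:
        return NFS4ERR_CB_PATH_DOWN
    return None  # request is acceptable
```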

Server reboot

When the server reboots, the client will get NFS4ERR_BADSESSION. The client SHOULD retry the RECURSIVE_SET operation after re-establishing the client ID and completing the RECLAIM_COMPLETE procedure.

Client Lease Expiry

If the client sends the RECURSIVE_SET operation and later there is a network disruption between the client and server, the client lease may expire. After the lease expiration the server will terminate the RECURSIVE_SET operation, which might result in partially modified files/directories under the parent directory on which the RECURSIVE_SET operation was executed.

The RECURSIVE_SET operation is tied to a specific client instance, so if the client lease has expired the server should cancel the RECURSIVE_SET operation. When a huge number of files need their attributes set, the server can determine the timeout, but the timeout must be less than the lease time.

Backchannel Consideration

Before the client sends a RECURSIVE_SET operation to the server, the client MUST check whether it has a backchannel established with the server. If there is no backchannel, the client MUST use only synchronous RECURSIVE_SET. If there is an existing backchannel, the client can use either synchronous or asynchronous RECURSIVE_SET. If the server wants to send a callback operation over the backchannel of a session and no backchannel exists for the session, the server cannot establish the backchannel, because only the client can associate connections with the backchannel. In that case, the server indicates that the session has no backchannel by setting the SEQ4_STATUS_CB_PATH_DOWN_SESSION flag bit in the response to the next SEQUENCE operation from the client. The client then associates a connection with the session.

Grace Consideration

The RECURSIVE_SET operation must honor the server grace period. During the grace period, the server should return NFS4ERR_GRACE to the client, and the client should retry the request until the grace period is over.

Position Consideration

The RECURSIVE_SET operation MUST NOT be the first operation of the compound request, and a compound request containing the RECURSIVE_SET operation should always have SEQUENCE as the first operation.
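
A receiver could check these position rules before executing a compound. The Python sketch below is a hypothetical validator over a list of operation names; it also checks that a SETATTR follows RECURSIVE_SET, as Step 1 of this section describes. The function name and the message strings are illustrative, not protocol elements.

```python
from typing import List

def check_recursive_set_position(ops: List[str]) -> List[str]:
    """Return a list of rule violations for a compound's operation names."""
    problems: List[str] = []
    if "RECURSIVE_SET" not in ops:
        return problems  # ordinary compound, nothing to check
    index = ops.index("RECURSIVE_SET")
    if index == 0:
        problems.append("RECURSIVE_SET is the first operation")
    if ops[0] != "SEQUENCE":
        problems.append("SEQUENCE is not the first operation")
    if "SETATTR" not in ops[index + 1:]:
        problems.append("no SETATTR follows RECURSIVE_SET")
    return problems

example = ["SEQUENCE", "PUTFH", "RECURSIVE_SET", "SETATTR", "RECURSIVE_SET_STATUS"]
print(check_recursive_set_position(example))  # prints []
```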

Note to RFC Editor: this section may be removed on publication as an RFC.

3. Implementation Considerations

A recommended Recursive Set operation in synchronous mode is shown in Figure 2.

Step 1: The client sends a RECURSIVE_SET request. In the request, rsa_sync must be set to true.

Step 2: If the storage server finishes recursively setting the attributes within the timeout period, it returns the result to the client. If the attributes are not set within the timeout period, the server must generate rsr_callback_id and rsr_recursiveverf and return them to the client. In addition, the server must respond to the client with NFS4ERR_PENDING.

Step 3: The client sends a RECURSIVE_SET_STATUS request containing rssa_recursive_taskid, which should be set to the rsr_callback_id obtained from the RECURSIVE_SET response. If rssa_recursive_taskid matches the rsr_callback_id cached on the storage server, the server returns the current status of the attribute set operation: NFS4_OK if the server has set all the attributes, or NFS4ERR_PENDING if the operation is still in progress. If the server encountered an error while setting attributes, the result code must be cached and returned in the response. If rssa_recursive_taskid differs from the value cached on the server, the server returns the error code NFS4ERR_INVAL.

Step 4: The client decodes the response. If the response is NFS4ERR_PENDING, the client retries the RECURSIVE_SET_STATUS operation after a delay. If the status returned by the server is NFS4_OK, the recursive attribute setting succeeded. If the SETATTR operation encountered an error, the recursive attribute setting failed. In either of these last two cases, the client returns a response to the application.
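
Steps 3 and 4 form a polling loop on the client. The following Python sketch shows the shape of that loop under stated assumptions: query_status is a hypothetical callable standing in for one RECURSIVE_SET_STATUS round trip, status codes are represented as strings, and the max_polls bound is an addition for the example rather than something this document specifies.

```python
import time
from typing import Callable

NFS4ERR_PENDING = "NFS4ERR_PENDING"

def poll_recursive_set(query_status: Callable[[], str],
                       delay_seconds: float,
                       max_polls: int,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the server reports something other than NFS4ERR_PENDING."""
    for _ in range(max_polls):
        status = query_status()
        if status != NFS4ERR_PENDING:
            return status  # NFS4_OK or a cached SETATTR error
        sleep(delay_seconds)  # wait before the next RECURSIVE_SET_STATUS
    return NFS4ERR_PENDING  # still running after max_polls attempts

# Simulated server replies: two "pending" answers, then success.
replies = iter([NFS4ERR_PENDING, NFS4ERR_PENDING, "NFS4_OK"])
print(poll_recursive_set(lambda: next(replies), 0.0, max_polls=10))  # prints NFS4_OK
```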

                 Client                                                     Server
                 +                                                             +
                 |                                                             |
                 |------ RECURSIVE_SET(rsa_sync = 1) ------------------------> |
                 |                                                             |
                 |<-----Response(rsr_callback_id = 0, rsr_recursiveverf = 0)---|  within the timeout period
                 |                                                             |
                 |                                                             |
                 |<----Response(rsr_callback_id = 1, rsr_recursiveverf = 1)----|  beyond the timeout period
                 |                                                             |
                 |                                                             |
                 |                                                             |
                 |------RECURSIVE_SET_STATUS(rssa_recursive_taskid = 1)------> |
                 |                                                             |
                 |<------Response--------------------------------------------- |
                 |                                                             |
                 |                                                             |

                           Figure 2:  A synchronous Recursive Set

An alternative Recursive Set operation in asynchronous mode is also given in Figure 3.

Step 1: The client sends a RECURSIVE_SET request. In the request, rsa_sync flag should be set to false.

Step 2: The storage server generates rsr_callback_id and rsr_recursiveverf, and sets the error code to NFS4ERR_PENDING. The storage server continues executing the recursive set operation.

Step 3: After receiving the response, and if the error code is NFS4ERR_PENDING, the client starts an asynchronous task to monitor the progress of RECURSIVE_SET.

Step 4: The client waits for the asynchronous CB_RECURSIVE_SET_NOTIFY message from the server and compares the rsr_callback_id and rsr_recursiveverf it carries with the values from the RECURSIVE_SET response. If both match, the notification is valid. If rsr_callback_id matches but rsr_recursiveverf does not, the client skips the message.

Step 5: If the client does not receive the asynchronous message, the asynchronous task is forcibly terminated when the session is destroyed.

If an error occurs while the storage server recursively sets the attributes of subdirectories and files, the storage server terminates the task and returns the error code to the client. All possible errors are limited to the error codes defined for SETATTR.
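
The matching in Step 4 can be sketched as a lookup in a table of outstanding tasks. In this Python example the table layout, the integer task id, and the function name are assumptions made for illustration only.

```python
from typing import Dict, Optional

def handle_notify(pending: Dict[int, bytes], callback_id: int,
                  verifier: bytes, status: str) -> Optional[str]:
    """Return the task's final status, or None if the message is skipped.

    `pending` maps each outstanding rsr_callback_id to the rsr_recursiveverf
    received in the RECURSIVE_SET response.
    """
    if pending.get(callback_id) != verifier:
        return None  # unknown task id or verifier mismatch: skip the message
    del pending[callback_id]  # the observer task for this id can stop
    return status

pending = {1: b"verf-001"}
print(handle_notify(pending, 1, b"stale", "NFS4_OK"))     # prints None
print(handle_notify(pending, 1, b"verf-001", "NFS4_OK"))  # prints NFS4_OK
```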

                 Client                                                     Server
                 +                                                             +
                 |                                                             |
                 |------ RECURSIVE_SET(rsa_sync = 0) ------------------------->|
                 |                                                             |
                 |<------Response(rsr_callback_id = 1, rsr_recursiveverf = 1)--|
                 |                                                             |
                 |                                                             |
                 |<------CB_RECURSIVE_SET_NOTIFY-------------------------------|
                 |                                                             |
                 |                                                             |
                 |                                                             |

                            Figure 3: An asynchronous Recursive Set

4. Recursive Set Operations

4.1 Operation TBD1: RECURSIVE_SET – Recursively sets the attributes of a directory and its subdirectories and files.

ARGUMENT

   struct RECURSIVE_SET4args {
           bool    rsa_sync;
   };

RESULT

   struct recursive_set_response4 {
           recursive_taskid4       rsr_callback_id;
           verifier4               rsr_recursiveverf;
   };

   union RECURSIVE_SET4res switch (nfsstat4 rsr_status) {
   case NFS4_OK:
           recursive_set_response4 rsr_resok4;
   default:
           void;
   };

DESCRIPTION

The RECURSIVE_SET operation is used by the client to recursively set the attributes of a directory and all its subdirectories and files. The operation should be placed before SETATTR in the compound request. When the storage server receives a compound request containing SETATTR that is not preceded by RECURSIVE_SET, the existing SETATTR processing remains unchanged. If SETATTR is preceded by RECURSIVE_SET, the storage server applies the attributes to the directory and to its subdirectories and files in recursive set mode.

If the operation completes successfully on the storage server, the values of rsr_callback_id and rsr_recursiveverf are 0.

If the recursive SETATTR operation does not complete on the storage server within the timeout period, the server generates values for rsr_callback_id and rsr_recursiveverf.

If rsa_sync is set to true, the client can choose one of the following implementations.

1. After the client receives NFS4ERR_PENDING in response to RECURSIVE_SET, it waits for a period of time and then sends RECURSIVE_SET_STATUS to query the progress of the current task. If the recursive attribute setting is still in progress, NFS4ERR_PENDING is returned. The recommended waiting period is half the lease time. The client continues to poll until it receives CB_RECURSIVE_SET_NOTIFY from the server or NFS4_OK for the RECURSIVE_SET_STATUS request.

2. The client can wait for CB_RECURSIVE_SET_NOTIFY from the server to learn that the recursive setting of attributes has completed.

4.2 Operation TBD2: RECURSIVE_SET_STATUS – Query the result of recursively setting the attributes of subdirectories and files

ARGUMENT

   struct RECURSIVE_SET_STATUS4args {
           stateid4        rssa_recursive_taskid;
   };

RESULT

   #define NFS4ERR_PENDING 10090

   struct RECURSIVE_SET_STATUS4res {
           nfsstat4        rssr_status;
   };

DESCRIPTION

rssa_recursive_taskid has the same value as rsr_callback_id in the RECURSIVE_SET response. The RECURSIVE_SET_STATUS operation is used by the client to query the status of a recursive set task (setting the attributes of subdirectories and files). The server must check whether rssa_recursive_taskid matches a task id on the server. If it matches and the task on the storage server is complete, NFS4_OK is returned. If an error occurred during task execution, that error code is returned without being extended or modified, so it is one of the error codes that can occur during the SETATTR operation. If the task is not yet complete, NFS4ERR_PENDING is returned.
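
On the server, this lookup can be sketched as a table of cached task results. The Python example below is a simplification under stated assumptions: the task id is an integer rather than a stateid4, status codes are strings, and the function name is hypothetical. The NFS4ERR_INVAL reply for an unknown task id follows the rule given in Section 3.

```python
from typing import Dict

NFS4_OK = "NFS4_OK"
NFS4ERR_PENDING = "NFS4ERR_PENDING"
NFS4ERR_INVAL = "NFS4ERR_INVAL"

def recursive_set_status(tasks: Dict[int, str], taskid: int) -> str:
    """Return the cached result for taskid, or NFS4ERR_INVAL if it is unknown.

    `tasks` maps a task id to NFS4ERR_PENDING while the task is running and
    to its final status (NFS4_OK or a SETATTR error) once it has ended.
    """
    return tasks.get(taskid, NFS4ERR_INVAL)

tasks = {1: NFS4ERR_PENDING, 2: NFS4_OK, 3: "NFS4ERR_ACCESS"}
print(recursive_set_status(tasks, 1))  # prints NFS4ERR_PENDING
print(recursive_set_status(tasks, 9))  # prints NFS4ERR_INVAL
```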

4.3 Operation TBD3: RECURSIVE_SET_CANCEL – Cancel a running recursive set task

ARGUMENT

   struct RECURSIVE_SET_CANCEL4args {
           stateid4        rsca_recursive_taskid;
   };

RESULT

   struct RECURSIVE_SET_CANCEL4res {
           nfsstat4        rscr_status;
   };

DESCRIPTION

RECURSIVE_SET_CANCEL is used to cancel a task that is being executed. The request contains rsca_recursive_taskid, whose value is obtained from the RECURSIVE_SET response. If the storage server fails to cancel the task, NFS4ERR_DELAY is returned; on receiving it, the client retries after a delay. If the current task is complete, NFS4_OK is returned.
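
This document does not say how a server stops a running task. One possible approach, shown here purely as an assumption, is a cooperative flag that the task checks between entries. In the Python sketch below, apply_until_cancelled and its parameters are hypothetical.

```python
import threading
from typing import Callable, Iterable

def apply_until_cancelled(entries: Iterable[str],
                          apply: Callable[[str], None],
                          cancel: threading.Event) -> int:
    """Apply attributes entry by entry; stop as soon as cancel is set.

    Returns the number of entries that were modified before the task stopped.
    """
    done = 0
    for entry in entries:
        if cancel.is_set():
            break  # a RECURSIVE_SET_CANCEL arrived; leave the rest untouched
        apply(entry)
        done += 1
    return done
```

With this approach, entries processed before the flag is set keep their new attributes, so a cancelled task can leave the tree partially modified.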

4.4 Operation TBD4: CB_RECURSIVE_SET_NOTIFY – Notify the client of the recursive set result

ARGUMENT

   struct CB_RECURSIVE_SET_NOTIFY4args {
           nfs_fh4         crsna_fh;
           stateid4        crsna_recursive_taskid;
           verifier4       crsna_recursiveverf;
           nfsstat4        crsna_status;
   };

RESULT

   struct CB_RECURSIVE_SET_NOTIFY4res {
           nfsstat4        crsnr_status;
   };

DESCRIPTION

CB_RECURSIVE_SET_NOTIFY is the callback that the server sends to notify the client of the result of a task that recursively sets the attributes of subdirectories and files. The client checks crsna_recursive_taskid and crsna_recursiveverf. If they match the values received in the earlier RECURSIVE_SET response, the client finishes its wait task. If they do not match, the client skips the notification and returns NFS4ERR_INVAL to the server.

Race condition between CB_RECURSIVE_SET_NOTIFY and RECURSIVE_SET_STATUS: a race can occur if a RECURSIVE_SET_STATUS request is in flight when the server sends CB_RECURSIVE_SET_NOTIFY. In this case the server will have cleaned up the recursive task id before it receives the RECURSIVE_SET_STATUS request. The server may return NFS4ERR_INVAL, and the client should handle this gracefully.
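
One way for a client to handle this race gracefully, offered as an assumption rather than a required behavior, is to prefer the result already delivered by the callback when a late RECURSIVE_SET_STATUS reply carries NFS4ERR_INVAL. The function name in this Python sketch is hypothetical.

```python
from typing import Optional

NFS4ERR_INVAL = "NFS4ERR_INVAL"

def resolve_status_reply(status_reply: str,
                         notified_status: Optional[str]) -> str:
    """Combine a RECURSIVE_SET_STATUS reply with any callback result seen so far.

    notified_status is the crsna_status from a matching CB_RECURSIVE_SET_NOTIFY,
    or None if no notification has been received for this task.
    """
    if status_reply == NFS4ERR_INVAL and notified_status is not None:
        # The server dropped the task id after notifying; trust the callback.
        return notified_status
    return status_reply

print(resolve_status_reply(NFS4ERR_INVAL, "NFS4_OK"))  # prints NFS4_OK
print(resolve_status_reply(NFS4ERR_INVAL, None))       # prints NFS4ERR_INVAL
```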

5. Security Considerations

TBD

6. IANA Considerations

TBD

7. References

7.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC7862]
Haynes, T., "Network File System (NFS) Version 4 Minor Version 2 Protocol", RFC 7862, DOI 10.17487/RFC7862, November 2016, <https://www.rfc-editor.org/info/rfc7862>.

Authors' Addresses

Minqian Zhang
Huawei Technologies
1899 Xiyuan
Chengdu
High-tech West District, 611731
China
Sunil Kumar Bhargo
Broadcom Inc.
Phone: +
Rijesh Kunhi Parambattu
Huawei Technologies
Dongyu Geng
Huawei Technologies
Yunfei Du
Huawei Technologies