Use dedicated SLABs #19

Closed
l0kod opened this issue Jan 30, 2024 · 18 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@l0kod
Member

l0kod commented Jan 30, 2024

Using a dedicated kmem_cache per Landlock kernel type could improve performance, and it would also make it possible to get some metrics via /proc/slabinfo.

One SLAB per type (a minimal sketch follows the list):

  • struct landlock_object
  • struct landlock_rule
  • struct landlock_ruleset
  • struct landlock_hierarchy
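
For illustration only, a hedged sketch of what dedicated caches could look like; the cache names, the SLAB_ACCOUNT flag, and the helper name are placeholders rather than an actual patch. Each name passed to kmem_cache_create() becomes its own row in /proc/slabinfo:

```c
#include <linux/slab.h>

/* Hedged sketch: one dedicated cache per Landlock kernel type. */
static struct kmem_cache *landlock_object_cache;
static struct kmem_cache *landlock_hierarchy_cache;

static int __init landlock_create_caches(void)
{
	/* The first argument is the row name reported by /proc/slabinfo. */
	landlock_object_cache = kmem_cache_create("landlock_object",
			sizeof(struct landlock_object), 0, SLAB_ACCOUNT, NULL);
	landlock_hierarchy_cache = kmem_cache_create("landlock_hierarchy",
			sizeof(struct landlock_hierarchy), 0, SLAB_ACCOUNT, NULL);
	if (!landlock_object_cache || !landlock_hierarchy_cache)
		return -ENOMEM;
	return 0;
}
```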
@l0kod l0kod added the enhancement and good first issue labels Jan 30, 2024
@l0kod l0kod added this to Landlock Feb 1, 2024
@l0kod l0kod moved this to Ready in Landlock Feb 1, 2024
@l0kod l0kod mentioned this issue Mar 12, 2024
@ayush-0110

Hi, I am Ayush, an Outreachy applicant. I have worked on several patches and got them reviewed, as mentioned in the project guidelines. I want to work on this issue. Could you please assign it to me and help me get started?
@l0kod @AlisonSchofield

@l0kod
Member Author

l0kod commented Mar 20, 2024

Hi Ayush!

The idea is to replace kzalloc() calls with kmem_cache_zalloc() and update the related dependencies. You should find plenty of examples in the kernel code (e.g. lsm_file_cache).

However, struct landlock_ruleset would not work because it contains a flexible array member (FAM). With a separate patch series, we could leverage the upcoming kmem_buckets_alloc().
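
As a rough illustration of the kzalloc()-to-kmem_cache_zalloc() replacement described above (a hedged sketch, not the actual Landlock patch; the landlock_object_cache global and the GFP flags are assumptions):

```c
#include <linux/slab.h>

/* Assumed global, created once at init time. */
static struct kmem_cache *landlock_object_cache;

/* Before: new_object = kzalloc(sizeof(*new_object), GFP_KERNEL_ACCOUNT);
 * After: allocate zeroed memory from the dedicated cache instead. */
static struct landlock_object *alloc_object_example(void)
{
	return kmem_cache_zalloc(landlock_object_cache, GFP_KERNEL_ACCOUNT);
}

/* The matching release path frees back into the same cache. */
static void free_object_example(struct landlock_object *object)
{
	kmem_cache_free(landlock_object_cache, object);
}
```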

@l0kod
Member Author

l0kod commented Mar 20, 2024

I'm assigning this issue to you. Please regularly update it with your progress. The updates don't need to be long, just enough to make sure you are not blocked.

@ayush-0110

Thanks a lot @l0kod. I will definitely keep updating my progress and seek your guidance wherever I feel stuck.

@ayush-0110

Hello @l0kod. I have gone through the lsm_file_cache code for reference and tried to use kmem_cache_zalloc() for struct landlock_object. Let me share the steps taken; kindly correct me wherever I am wrong:

  1. Created a struct kmem_cache pointer for the cache.
  2. Created an initialization function that creates the cache to allocate memory from.
  3. Replaced the kzalloc() calls with kmem_cache_zalloc().

The problem I am facing is that I am not sure where to call the initialization function so that the cache is created only once. Could you help me, please? @l0kod @AlisonSchofield

@l0kod
Member Author

l0kod commented Mar 26, 2024

> Hello @l0kod. I have gone through the lsm_file_cache code for reference and tried to use kmem_cache_zalloc() for struct landlock_object. Let me share the steps taken; kindly correct me wherever I am wrong:
>
>   1. Created a struct kmem_cache pointer for the cache.
>   2. Created an initialization function that creates the cache to allocate memory from.
>   3. Replaced the kzalloc() calls with kmem_cache_zalloc().

This looks good.

> The problem I am facing is that I am not sure where to call the initialization function so that the cache is created only once. Could you help me, please? @l0kod @AlisonSchofield

You can call the initialization function from landlock_init().
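
For context, a rough sketch of that wiring; landlock_object_init_cache() is a placeholder name for the cache-creation helper, and the rest of landlock_init() is elided. Since landlock_init() runs once at boot as Landlock's LSM initialization hook, the cache would be created exactly once:

```c
/* security/landlock/setup.c (hedged sketch) */
static int __init landlock_init(void)
{
	/* Create the dedicated cache(s) before any hook can allocate. */
	landlock_object_init_cache();
	/* ... existing LSM hook registration left unchanged ... */
	return 0;
}
```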

@ayush-0110

Okay, thanks @l0kod. I am sending the changes for your review; kindly look into them and let me know how to proceed.

@l0kod
Member Author

l0kod commented Mar 28, 2024

For reference: the related patch.

@ayush-0110

Hello @l0kod.
I was trying to work on Paul's suggestion and I am facing a problem with replacing the kzalloc() calls with kmem_cache_zalloc() when using the KMEM macro from include/linux/slab.h: kmem_cache_zalloc() needs a cache pointer, but the KMEM macro doesn't return or allocate any such pointer. Is there a way to do this using the macro, or do I have to avoid it for this case and instead use the kmem_cache functions defined in the Linux memory management API documentation?
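
For context: as defined in include/linux/slab.h, KMEM_CACHE() expands to a kmem_cache_create() call, so it does evaluate to a struct kmem_cache * that can be stored and later passed to kmem_cache_zalloc(). A minimal sketch (the cache variable name is a placeholder):

```c
#include <linux/slab.h>

static struct kmem_cache *landlock_object_cache; /* placeholder name */

static int __init example_cache_init(void)
{
	/* KMEM_CACHE() derives the cache name, size, and alignment from
	 * the struct and returns the new cache (or NULL on failure). */
	landlock_object_cache = KMEM_CACHE(landlock_object, 0);
	return landlock_object_cache ? 0 : -ENOMEM;
}
```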

@ayush-0110

Hello @l0kod. I sent a v2 of the patch. Could you please look into it and suggest any changes that need to be made? Thanks.

@ayush-0110

ayush-0110 commented Apr 1, 2024

Hello @l0kod @AlisonSchofield. I have created a complete patch aimed at solving this issue: it uses a kmem_cache to replace the kzalloc() calls with kmem_cache_zalloc() and also exposes metrics via /proc/slabinfo. Kindly look into it. Thanks.
Link: the related patch

@ayush-0110

Hello @l0kod @AlisonSchofield. Kindly look into the patch I created, as today is the last date for submitting our final application.

@l0kod
Member Author

l0kod commented Apr 2, 2024

> Hello @l0kod @AlisonSchofield. Kindly look into the patch I created, as today is the last date for submitting our final application.

You can use the link to the latest patch as your contribution. I'll review it shortly.

@ayush-0110

Okay @l0kod, thanks. Now I would like to convert the allocations for struct landlock_ruleset as well and, as you mentioned, I have to use kmem_buckets_alloc(). Can you guide me on how to proceed with it?

@l0kod
Member Author

l0kod commented Apr 11, 2024

It's too soon to work on kmem_buckets_alloc() because of the ongoing discussion. We'll see what happens.

As suggested on the mailing list, some metrics would be useful to assess the impact of kmem_cache.

@ayush-0110

Hello @l0kod. In my recent patch, I included some metrics from /proc/slabinfo, as that was mentioned in the issue. Are any other metrics needed as well?

@l0kod
Member Author

l0kod commented Apr 30, 2024

> Hello @l0kod. In my recent patch, I included some metrics from /proc/slabinfo, as that was mentioned in the issue. Are any other metrics needed as well?

By metrics I mean measuring the performance impact of such a change. This should help: #24
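
As one possible starting point, a hedged userspace sketch (not one of the benchmark tools discussed in #24): time a tight loop of landlock_create_ruleset() calls on kernels with and without the dedicated caches and compare the results.

```c
/* Hedged micro-benchmark sketch: repeatedly create and close rulesets. */
#define _GNU_SOURCE
#include <linux/landlock.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	const struct landlock_ruleset_attr attr = {
		.handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
	};
	struct timespec start, end;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < 100000; i++) {
		const int fd = syscall(__NR_landlock_create_ruleset, &attr,
				       sizeof(attr), 0);
		if (fd < 0)
			return 1;
		close(fd);
	}
	clock_gettime(CLOCK_MONOTONIC, &end);
	printf("%.3f s for 100000 rulesets\n",
	       (end.tv_sec - start.tv_sec) +
		       (end.tv_nsec - start.tv_nsec) / 1e9);
	return 0;
}
```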

@l0kod
Member Author

l0kod commented Jun 11, 2024

I guess @ayush-0110 is not working on this task anymore. Let's close this issue until we get the benchmark tools; see #24 and the upstream comments.

@l0kod l0kod closed this as completed Jun 11, 2024
@github-project-automation github-project-automation bot moved this from Ready to In review in Landlock Jun 11, 2024
@l0kod l0kod moved this from In review to Backlog in Landlock Jun 11, 2024
l0kod pushed a commit that referenced this issue Dec 16, 2024
commit 98100e8 upstream.

The force umount action (umount -f) will attempt to kill all rpc_tasks, even
though the umount operation may ultimately fail if some files remain open.
Consequently, if an action attempts to open a file, it can potentially
send two rpc_tasks to the NFS server.

                   NFS CLIENT
thread1                             thread2
open("file")
...
nfs4_do_open
 _nfs4_do_open
  _nfs4_open_and_get_state
   _nfs4_proc_open
    nfs4_run_open_task
     /* rpc_task1 */
     rpc_run_task
     rpc_wait_for_completion_task

                                    umount -f
                                    nfs_umount_begin
                                     rpc_killall_tasks
                                      rpc_signal_task
     rpc_task1 been wakeup
     and return -512
 _nfs4_do_open // while loop
    ...
    nfs4_run_open_task
     /* rpc_task2 */
     rpc_run_task
     rpc_wait_for_completion_task

While processing an open request, nfsd will first attempt to find or
allocate an nfs4_openowner. If it finds an nfs4_openowner that is not
marked as NFS4_OO_CONFIRMED, this nfs4_openowner will be released. Since
two rpc_tasks can attempt to open the same file simultaneously from the
client to the server, and because two instances of nfsd can run
concurrently, this situation can lead to significant memory leaks.
Additionally, when we echo 0 to /proc/fs/nfsd/threads, a warning will be
triggered.

                    NFS SERVER
nfsd1                  nfsd2       echo 0 > /proc/fs/nfsd/threads

nfsd4_open
 nfsd4_process_open1
  find_or_alloc_open_stateowner
   // alloc oo1, stateid1
                       nfsd4_open
                        nfsd4_process_open1
                        find_or_alloc_open_stateowner
                        // find oo1, without NFS4_OO_CONFIRMED
                         release_openowner
                          unhash_openowner_locked
                          list_del_init(&oo->oo_perclient)
                          // cannot find this oo
                          // from client, LEAK!!!
                         alloc_stateowner // alloc oo2

 nfsd4_process_open2
  init_open_stateid
  // associate oo1
  // with stateid1, stateid1 LEAK!!!
  nfs4_get_vfs_file
  // alloc nfsd_file1 and nfsd_file_mark1
  // all LEAK!!!

                         nfsd4_process_open2
                         ...

                                    write_threads
                                     ...
                                     nfsd_destroy_serv
                                      nfsd_shutdown_net
                                       nfs4_state_shutdown_net
                                        nfs4_state_destroy_net
                                         destroy_client
                                          __destroy_client
                                          // won't find oo1!!!
                                     nfsd_shutdown_generic
                                      nfsd_file_cache_shutdown
                                       kmem_cache_destroy
                                       for nfsd_file_slab
                                       and nfsd_file_mark_slab
                                       // bark since nfsd_file1
                                       // and nfsd_file_mark1
                                       // still alive

=======================================================================
BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
__kmem_cache_shutdown()
-----------------------------------------------------------------------

Slab 0xffd4000004438a80 objects=34 used=1 fp=0xff11000110e2ad28
flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
CPU: 4 UID: 0 PID: 757 Comm: sh Not tainted 6.12.0-rc6+ #19
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.16.1-2.fc37 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 slab_err+0xb0/0xf0
 __kmem_cache_shutdown+0x15c/0x310
 kmem_cache_destroy+0x66/0x160
 nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
 nfsd_destroy_serv+0x251/0x2a0 [nfsd]
 nfsd_svc+0x125/0x1e0 [nfsd]
 write_threads+0x16a/0x2a0 [nfsd]
 nfsctl_transaction_write+0x74/0xa0 [nfsd]
 vfs_write+0x1ae/0x6d0
 ksys_write+0xc1/0x160
 do_syscall_64+0x5f/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Disabling lock debugging due to kernel taint
Object 0xff11000110e2ac38 @offset=3128
Allocated in nfsd_file_do_acquire+0x20f/0xa30 [nfsd] age=1635 cpu=3
pid=800
 nfsd_file_do_acquire+0x20f/0xa30 [nfsd]
 nfsd_file_acquire_opened+0x5f/0x90 [nfsd]
 nfs4_get_vfs_file+0x4c9/0x570 [nfsd]
 nfsd4_process_open2+0x713/0x1070 [nfsd]
 nfsd4_open+0x74b/0x8b0 [nfsd]
 nfsd4_proc_compound+0x70b/0xc20 [nfsd]
 nfsd_dispatch+0x1b4/0x3a0 [nfsd]
 svc_process_common+0x5b8/0xc50 [sunrpc]
 svc_process+0x2ab/0x3b0 [sunrpc]
 svc_handle_xprt+0x681/0xa20 [sunrpc]
 nfsd+0x183/0x220 [nfsd]
 kthread+0x199/0x1e0
 ret_from_fork+0x31/0x60
 ret_from_fork_asm+0x1a/0x30

Add nfs4_openowner_unhashed to help find unhashed nfs4_openowner structures,
and break out of the nfsd4_open process to fix this problem.

Cc: stable@vger.kernel.org # v5.4+
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>