Bug #65778

open

qa: valgrind error: Leak_StillReachable malloc malloc strdup

Added by Milind Changire 15 days ago. Updated 8 days ago.

Status: New
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2024-04-30T14:36:44.620 INFO:tasks.ceph.mds.a.smithi081.stderr:2024-04-30T14:36:44.605+0000 f85f640 -1 mds.a *** got signal Terminated ***
2024-04-30T14:36:50.304 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.a is failed for ~0s
2024-04-30T14:36:50.305 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.c is failed for ~7s
2024-04-30T14:36:50.305 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.e is failed for ~7s
2024-04-30T14:36:50.305 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.b is failed for ~7s
2024-04-30T14:36:50.701 INFO:tasks.ceph.mds.a:Stopped
2024-04-30T14:36:50.702 DEBUG:tasks.ceph.mds.c:waiting for process to exit
2024-04-30T14:36:50.702 INFO:teuthology.orchestra.run:waiting for 300
2024-04-30T14:36:50.702 DEBUG:teuthology.orchestra.run:got remote process result: 42
2024-04-30T14:36:50.702 ERROR:teuthology.orchestra.daemon.state:Error while waiting for process to exit
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_9481b1d62f50e7d0a4f3dd83adf6945b08d5ff17/teuthology/orchestra/daemon/state.py", line 139, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/git.ceph.com_teuthology_9481b1d62f50e7d0a4f3dd83adf6945b08d5ff17/teuthology/orchestra/run.py", line 479, in wait
    proc.wait()
  File "/home/teuthworker/src/git.ceph.com_teuthology_9481b1d62f50e7d0a4f3dd83adf6945b08d5ff17/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_teuthology_9481b1d62f50e7d0a4f3dd83adf6945b08d5ff17/teuthology/orchestra/run.py", line 181, in _raise_for_status
    raise CommandFailedError(
teuthology.exceptions.CommandFailedError: Command failed on smithi081 with status 42: "cd /home/ubuntu/cephtest && sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper term env 'OPENSSL_ia32cap=~0x1000000000000000' valgrind --trace-children=no --child-silent-after-fork=yes '--soname-synonyms=somalloc=*tcmalloc*' --num-callers=50 --suppressions=/home/ubuntu/cephtest/valgrind.supp --xml=yes --xml-file=/var/log/ceph/valgrind/mds.c.log --time-stamp=yes --vgdb=yes --exit-on-first-error=yes --error-exitcode=42 --tool=memcheck ceph-mds -f --cluster ceph -i c" 

Teuthology Job

Actions #1

Updated by Kotresh Hiremath Ravishankar 12 days ago

  • Assignee set to Milind Changire
Actions #2

Updated by Milind Changire 10 days ago

@Venky Shankar How do I interpret this?
I don't see any frame pointing to a Ceph source file, apart from the initial finger pointing at ceph-mds.

<?xml version="1.0"?>

<valgrindoutput>

<protocolversion>4</protocolversion>
<protocoltool>memcheck</protocoltool>

<preamble>
  <line>Memcheck, a memory error detector</line>
  <line>Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.</line>
  <line>Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info</line>
  <line>Command: ceph-mds -f --cluster ceph -i b</line>
</preamble>

<pid>68459</pid>
<ppid>68448</ppid>
<tool>memcheck</tool>

<args>
  <vargv>
    <exe>/usr/bin/valgrind</exe>
    <arg>--trace-children=no</arg>
    <arg>--child-silent-after-fork=yes</arg>
    <arg>--soname-synonyms=somalloc=*tcmalloc*</arg>
    <arg>--num-callers=50</arg>
    <arg>--suppressions=/home/ubuntu/cephtest/valgrind.supp</arg>
    <arg>--xml=yes</arg>
    <arg>--xml-file=/var/log/ceph/valgrind/mds.b.log</arg>
    <arg>--time-stamp=yes</arg>
    <arg>--vgdb=yes</arg>
    <arg>--exit-on-first-error=yes</arg>
    <arg>--error-exitcode=42</arg>
    <arg>--tool=memcheck</arg>
  </vargv>
  <argv>
    <exe>ceph-mds</exe>
    <arg>-f</arg>
    <arg>--cluster</arg>
    <arg>ceph</arg>
    <arg>-i</arg>
    <arg>b</arg>
  </argv>
</args>

<status>
  <state>RUNNING</state>
  <time>00:00:00:00.060 </time>
</status>

<status>
  <state>FINISHED</state>
  <time>00:00:33:36.844 </time>
</status>

<error>
  <unique>0x3137</unique>
  <tid>1</tid>
  <threadname>ceph-mds</threadname>
  <kind>Leak_PossiblyLost</kind>
  <xwhat>
    <text>32 bytes in 1 blocks are possibly lost in loss record 1,308 of 3,147</text>
    <leakedbytes>32</leakedbytes>
    <leakedblocks>1</leakedblocks>
  </xwhat>
  <stack>
    <frame>
      <ip>0x484BF70</ip>
      <obj>/usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so</obj>
      <fn>calloc</fn>
      <dir>/builddir/build/BUILD/valgrind-3.22.0/coregrind/m_replacemalloc</dir>
      <file>vg_replace_malloc.c</file>
      <line>1595</line>
    </frame>
    <frame>
      <ip>0x62BD6A5</ip>
      <obj>/usr/lib64/libnl-3.so.200.26.0</obj>
      <fn>__trans_list_add</fn>
    </frame>
    <frame>
      <ip>0x6232F1E</ip>
      <obj>/usr/lib64/libnl-route-3.so.200.26.0</obj>
    </frame>
    <frame>
      <ip>0x400507D</ip>
      <obj>/usr/lib64/ld-linux-x86-64.so.2</obj>
      <fn>call_init</fn>
      <dir>/usr/src/debug/glibc-2.34-105.el9.x86_64/elf</dir>
      <file>dl-init.c</file>
      <line>70</line>
    </frame>
    <frame>
      <ip>0x400507D</ip>
      <obj>/usr/lib64/ld-linux-x86-64.so.2</obj>
      <fn>call_init</fn>
      <dir>/usr/src/debug/glibc-2.34-105.el9.x86_64/elf</dir>
      <file>dl-init.c</file>
      <line>26</line>
    </frame>
    <frame>
      <ip>0x400516B</ip>
      <obj>/usr/lib64/ld-linux-x86-64.so.2</obj>
      <fn>_dl_init</fn>
      <dir>/usr/src/debug/glibc-2.34-105.el9.x86_64/elf</dir>
      <file>dl-init.c</file>
      <line>117</line>
    </frame>
    <frame>
      <ip>0x401CC29</ip>
      <obj>/usr/lib64/ld-linux-x86-64.so.2</obj>
    </frame>
    <frame>
      <ip>0x5</ip>
    </frame>
    <frame>
      <ip>0x1FFF000C1E</ip>
    </frame>
    <frame>
      <ip>0x1FFF000C27</ip>
    </frame>
    <frame>
      <ip>0x1FFF000C2A</ip>
    </frame>
    <frame>
      <ip>0x1FFF000C34</ip>
    </frame>
    <frame>
      <ip>0x1FFF000C39</ip>
    </frame>
    <frame>
      <ip>0x1FFF000C3C</ip>
    </frame>
  </stack>
</error>

</valgrindoutput>
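
For reference, a minimal sketch for digging through logs like this (assuming Python 3 with only the standard library; this script is not part of teuthology) that prints each error's kind plus every frame with a resolved symbol from a valgrind --xml=yes log:

import sys
import xml.etree.ElementTree as ET

# Usage: python3 summarize_valgrind.py /var/log/ceph/valgrind/mds.b.log
tree = ET.parse(sys.argv[1])
for error in tree.getroot().iter("error"):
    kind = error.findtext("kind")
    what = error.findtext("xwhat/text") or error.findtext("what") or ""
    print(f"{kind}: {what}")
    for frame in error.iter("frame"):
        fn = frame.findtext("fn")
        if fn:  # skip frames valgrind could not symbolize
            print(f"    {fn}  ({frame.findtext('obj')})")

Run against the log above, the only record is the Leak_PossiblyLost one, with calloc -> __trans_list_add (libnl-3) -> call_init/_dl_init (ld.so), i.e. an allocation made by libnl's constructors while the dynamic loader initializes shared libraries, which is why no MDS source frame shows up.
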
Actions #3

Updated by Venky Shankar 8 days ago

Milind Changire wrote in #note-2:

@Venky Shankar How do I interpret this?
I don't see any frame pointing to a Ceph source file, apart from the initial finger pointing at ceph-mds.

That's correct. The traceback doesn't seem to point to anything on the ceph-mds side apart from some malloc frame (I think that's related to tcmalloc).

We should try to suppress this warning. Otherwise, I guess the online-data PR is mostly good to merge. yay!
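
A suppression along these lines might work (a hand-written sketch based on the frames in #note-2; the entry name is a placeholder and it has not been generated with --gen-suppressions=all, so the frame list may need adjusting). It would go into the valgrind.supp file the suite already passes via --suppressions=:

# sketch only: suppress the libnl-3 constructor allocation reported in #note-2
{
   libnl3_trans_list_add_ctor_leak
   Memcheck:Leak
   match-leak-kinds: possible
   fun:calloc
   fun:__trans_list_add
   obj:*libnl-route-3.so*
   ...
}

Note the issue title also mentions a Leak_StillReachable via malloc/strdup; if that turns out to be a separate record, the match-leak-kinds line (and the frames) will need widening, or a second entry, once that traceback is captured.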
