
Thread: Seg Fault my fault?

  1. #1
    Join Date
    Jan 2001
    Location
    Somewhere in middle America
    Posts
    164

    Seg Fault my fault?

    Well... I'm sure I'm causing it, but I'm not sure that it is because of my code.

    I am working on a program that does a lot of memory allocation and mucking around with pointers to pointers to structures of pointers... you get the point.

    I have been running the code through its paces trying to find potential problems.
    I can cause the code to generate a segmentation fault, but I'm not sure if the segmentation fault is due to a problem with the code itself or with the means by which I trigger the fault.

    The system I am testing my program on is Red Hat 7.3 (kernel 2.4.18-3) with 128MB of RAM.

    I launched 5 instances of my program and fed them each enough input to make each use well over 256MB of RAM. At the same time I had a Perl script running that I wrote to also eat as much memory as it could. This whole thing resulted in constant thrashing, as evidenced by the unending hard drive access.

    It was after an hour or more of this activity that one or more of my processes would terminate with a Seg Fault. I had logging in place to track whether memory allocation had failed within my program, and it hadn't. Several times while testing this way, my Perl script also suffered a Seg Fault. Yet another time the system halted with a kernel panic.

    The only thing I have been able to tell from my debugging is that my program seems to die shortly after function calls. A seg fault will be triggered near the beginning of the called function (though not at the same spot in the code each time).

    Every time I run the program it works on the exact same input data, and there is no randomness in how it processes the data, yet the Seg Faults occur at different times.

    After adding an additional 256MB of RAM to my test system I haven't been able to cause the Seg Faults. I even tried setting ulimit -m (and -v) to 4096. This causes my memory allocations to start failing, but the code handles it nicely.
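
    For reference, the malloc-checking pattern I'm describing looks roughly like this (a simplified sketch with made-up names, not the actual library code):

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Sketch of the pattern: every allocation is checked, and failure is
     * reported and handled instead of being dereferenced. store_record is
     * an illustrative name, not a real function from my library. */
    char *store_record(const char *data)
    {
        char *copy = malloc(strlen(data) + 1);
        if (copy == NULL) {
            /* Under ulimit -v 4096 this branch fires and the caller
             * unwinds cleanly instead of crashing. */
            fprintf(stderr, "allocation failed\n");
            return NULL;
        }
        strcpy(copy, data);
        return copy;
    }

    int main(void)
    {
        char *r = store_record("example");
        if (r != NULL)
            free(r);
        return 0;
    }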

    My question:

    Should I just chalk it up to the constant page faults eventually leading to some amount of memory corruption that ends up causing a seg fault in whichever process encounters it first?

    I don't want to sound like one of those programmers who thinks everything is someone else's bug, but debugging my code seems to be getting me nowhere, especially when the errors are not even isolated to my program.

    Thanks
    My Machine:
    Maytag SAV5905
    710 rpm Stainless Steel Drum
    Dual boot: Gentoo / Tide

  2. #2
    Join Date
    Jun 2000
    Location
    Sweden
    Posts
    80
    Since you didn't post any code exhibiting the problem (I know that can be hard to extract from complex programs), one can only guess at the cause. But if malloc() starts failing to allocate memory after a while (you are checking whether malloc() returns NULL, I hope), it can mean that your program simply needs more and more memory as it executes *or* that you are leaking memory somewhere. It can also be caused by your tampering with memory which isn't yours to tamper with, resulting in undefined behaviour, possibly a segmentation fault.
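
    As a made-up example of that last case: writing even one byte past the end of an allocation can corrupt the heap bookkeeping and crash much later, somewhere completely unrelated:

    Code:
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char *buf = malloc(8);
        if (buf == NULL)
            return 1;
        /* strcpy writes 9 bytes (8 characters plus '\0') into an 8-byte
         * block. This is undefined behaviour: it may corrupt malloc's
         * bookkeeping and only crash much later, in free() or in some
         * later, unrelated malloc(). */
        strcpy(buf, "12345678");
        free(buf);
        return 0;
    }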

  3. #3
    Join Date
    Jan 2001
    Location
    Somewhere in middle America
    Posts
    164
    The amount of memory my process is using is valid. The purpose of my program (actually a library I am testing) is to store data in an in-memory, database-like structure.
    I am checking the returns from malloc, and the only time they fail is if I use ulimit to set the maximum memory to something other than 'unlimited' and exceed that limit. Even when the mallocs fail, my code recovers gracefully.

    My main question is: could it be not my code that is causing the problems, but the sheer brutality of my testing that is causing failures in my program and any other program that happens to be running during the testing?

    Thanks
    My Machine:
    Maytag SAV5905
    710 rpm Stainless Steel Drum
    Dual boot: Gentoo / Tide

  4. #4
    Join Date
    Jun 2000
    Location
    Houston, TX, USA
    Posts
    1,290
    A possible explanation is that your code is overflowing its stack. That would correlate nicely with the 'after function calls' timing - if the stack is overflowed, you could get a seg fault when the program tries to read the return address.
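
    A trivial, purely illustrative way to reproduce that failure mode is unbounded recursion; build it without optimisation and it faults once the stack runs out:

    Code:
    /* Each call carves a new frame out of the stack; eventually the
     * stack is exhausted and the fault hits right around function
     * entry. (Build without optimisation, or the compiler may remove
     * the recursion.) */
    unsigned long depth(unsigned long n)
    {
        volatile unsigned long frame[64];   /* make each frame non-trivial */
        frame[0] = n;
        return frame[0] + depth(n + 1);
    }

    int main(void)
    {
        return (int)depth(0);
    }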
    "I'm not closed-minded, you're just wrong" - Bucky Katt

  5. #5
    Join Date
    Apr 2001
    Location
    SF Bay Area, CA
    Posts
    14,936
    Originally posted by Stuka
    if the stack is overflowed, you could get a seg fault when the program tries to read the return address.
    Or when the program tries to create function-local variables (which live on the stack).
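
    For example (hypothetical code), a big automatic array is carved out of the stack the moment the function is entered, so the fault shows up "near the beginning of the called function" before any of its statements have run:

    Code:
    void eats_stack(void)
    {
        char big[16 * 1024 * 1024];   /* 16MB of locals: far beyond a typical stack */
        big[0] = 0;                   /* touch it so it is actually used */
    }

    int main(void)
    {
        eats_stack();
        return 0;
    }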
