Issue with Adaptec RAID not pulling all S.M.A.R.T. stats #189

Closed
mcesnik opened this issue Aug 12, 2021 · 3 comments · Fixed by #308

Comments


mcesnik commented Aug 12, 2021

So this isn't so much a problem with smartctl as it is with Adaptec. I am creating this issue instead of a PR because the solution is a bit of a hack and I just want to provide insight and leave it at that.

NB: Since this solution uses Python (I do not know Go), you will need to create your own Dockerfile.

First, create a wrapper around smartctl and place it in /scrutiny/bin. The arcconf binary also needs to be in /scrutiny/bin.

#!/usr/bin/python3

import os
import sys
import json
import xml.etree.ElementTree as ET

arcconf = "/scrutiny/bin/arcconf getsmartstats 1"
smartctl = "/usr/sbin/smartctl"

smart_stats = {
    1: "Raw_Read_Error_Rate",
    3: "Spin_Up_Time",
    4: "Start_Stop_Count",
    5: "Reallocated_Sector_Ct",
    7: "Seek_Error_Rate",
    9: "Power_On_Hours",
    10: "Spin_Retry_Count",
    11: "Calibration_Retry_Count",
    12: "Power_Cycle_Count",
    192: "Power-Off_Retract_Count",
    193: "Load_Cycle_Count",
    194: "Temperature_Celsius",
    196: "Reallocated_Event_Count",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
    199: "UDMA_CRC_Error_Count",
    200: "Multi_Zone_Error_Rate"
}

if __name__ == "__main__":
    # Run the real smartctl with the same arguments and load its JSON output into memory
    cmd = "%s %s" % (smartctl, " ".join(sys.argv[1:]))
    output = json.loads("".join( [ i.strip() for i in os.popen(cmd).readlines() ] ))

    # Check to see if this device is actually for aacraid and do this 'extra' work if so
    aacraid = [arg for arg in sys.argv if arg.startswith("aac")]

    if any(aacraid):
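        # Get the device id (the last field of "aacraid,H,L,ID")
        device_id = int(aacraid[0].split(",")[3])

        # Call arcconf getsmartstats for controller 1 and keep only the XML lines
        # (the trailing two XML lines are dropped)
        xml_lines = [i.strip() for i in os.popen(arcconf).readlines() if i[0] == "<"][0:-2]

        # Pick the drive element whose id matches the device
        stats = [dev for dev in ET.fromstring("".join(xml_lines)) if int(dev.attrib["id"]) == device_id][0]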

        # Now get the attributes for the device we extracted
        attributes = [ (int(stat.attrib["id"], 16), stat.attrib["normalizedCurrent"], stat.attrib["normalizedWorst"], stat.attrib["rawValue"]) for stat in stats ]

        # Write the ata_smart_attributes (I have no idea if these matter for a SCSI or not)
        output["ata_smart_attributes"] = {
          "revision": 16,
          "table": [ {
                      "id": attribute[0],
                      "name": smart_stats[attribute[0]],
                      "current": int(attribute[1]),
                      "worst": int(attribute[2]),
                      "raw": { "value": int(attribute[3]), "string": attribute[3] }
                     } for attribute in attributes if attribute[0] in smart_stats ]
        }

        # And write these additional properties too (scrutiny uses these for sure)
        output["temperature"] = { "current": [ int(a[3]) for a in attributes if a[0] == 194 ][0] }
        output["power_on_time"] = { "hours": [ int(a[3]) for a in attributes if a[0] == 9 ][0] }
        output["power_cycle_count"] = [ int(a[3]) for a in attributes if a[0] == 12 ][0]

    # Dump the whole thing to stdout just like smartctl
    print(json.dumps(output, indent=4))
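
The XML handling in the wrapper above is compressed into a single expression, so here is a minimal sketch of the same idea split into small functions. The sample text is hypothetical: it is only shaped after the attribute names the wrapper reads (id, normalizedCurrent, normalizedWorst, rawValue), the element names are made up, and real arcconf output may differ. In particular, this sketch assumes a single root element and does not reproduce the [0:-2] slice. The helper names (extract_xml, parse_smart_stats, raw_value) are illustrative and not part of scrutiny or arcconf.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real arcconf output may be laid out differently.
SAMPLE = """Controllers found: 1
<SmartStats controllerID="0">
<PhysicalDriveSmartStats channel="0" id="0">
<Attribute id="0x09" normalizedCurrent="98" normalizedWorst="98" rawValue="1234" />
<Attribute id="0xC2" normalizedCurrent="110" normalizedWorst="100" rawValue="37" />
</PhysicalDriveSmartStats>
<PhysicalDriveSmartStats channel="0" id="1">
<Attribute id="0x0C" normalizedCurrent="100" normalizedWorst="100" rawValue="42" />
</PhysicalDriveSmartStats>
</SmartStats>
Command completed successfully.
"""


def extract_xml(text):
    """Keep only the lines that look like XML, dropping the banner text."""
    return "".join(
        line.strip() for line in text.splitlines() if line.lstrip().startswith("<")
    )


def parse_smart_stats(text, device_id):
    """Return (id, current, worst, raw) tuples for one drive, or [] if absent."""
    root = ET.fromstring(extract_xml(text))
    for drive in root:
        if int(drive.attrib["id"]) == device_id:
            return [
                (
                    int(a.attrib["id"], 16),  # attribute ids are hex strings
                    int(a.attrib["normalizedCurrent"]),
                    int(a.attrib["normalizedWorst"]),
                    int(a.attrib["rawValue"]),
                )
                for a in drive
            ]
    return []


def raw_value(attributes, attr_id):
    """Raw value of one SMART attribute, or None if the drive does not report it."""
    return next((raw for (aid, _cur, _worst, raw) in attributes if aid == attr_id), None)
```

With a helper like raw_value, the wrapper could skip setting temperature, power_on_time or power_cycle_count when attribute 194, 9 or 12 is missing, instead of failing with an IndexError on the [0] lookup.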

For reference, this is my Dockerfile:

FROM analogj/scrutiny

EXPOSE 8080

RUN apt -y update && apt -y upgrade
RUN apt -y install tzdata python3

COPY arcconf /scrutiny/bin
RUN chmod 755 /scrutiny/bin/arcconf

COPY smartctl /scrutiny/bin
RUN chmod 755 /scrutiny/bin/smartctl

After building a new image with docker build, you can create a container as per the standard documentation. Before you do, add the following to a collector.yaml and mount it into /scrutiny/config as a volume.

devices:
  - device: /dev/sd-
    type:
      - aacraid,0,0,0
      - aacraid,0,0,1
      - ... the rest of your drives here
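
Each type entry follows smartctl's aacraid,H,L,ID form (host number, LUN, ID), and the wrapper uses the last field to pick the matching drive in the arcconf output. A minimal sketch of that lookup, with a shape check added, is below. The function name aacraid_device_id and the argument lists in the usage lines are hypothetical, and the sketch assumes the type string arrives as its own argument, as the wrapper above does.

```python
def aacraid_device_id(args):
    """Return the ID field of the first 'aacraid,H,L,ID' argument, or None."""
    for arg in args:
        if arg.startswith("aacraid,"):
            parts = arg.split(",")
            # Expect exactly: aacraid, host, lun, id (all non-negative integers)
            if len(parts) == 4 and all(p.isdigit() for p in parts[1:]):
                return int(parts[3])
    return None


# Hypothetical argument lists, for illustration only
print(aacraid_device_id(["--json", "--device", "aacraid,0,0,1", "/dev/sda"]))
print(aacraid_device_id(["--json", "/dev/sda"]))
```

Returning None for anything that is not a well-formed aacraid type lets the wrapper pass plain devices through to smartctl untouched.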

I hope this helps someone save some time if they are using some legacy Adaptec cards like me :-)

Author

mcesnik commented Aug 12, 2021

Also, I noticed that although $TZ is supposed to be supported, the tzdata package is not installed.

Owner

AnalogJ commented Jun 25, 2022

@mcesnik Appreciate your patience and thanks for the helpful Python script.

The beta branch (and beta-* images) support a new collector config file option, commands.metrics_smartctl_bin, which can be used to customize the smartctl binary path.

I'll also add this to the documentation section.

Thanks again!

Owner

AnalogJ commented Jun 25, 2022

Though, I am curious: why is this necessary? smartctl is supposed to support Adaptec RAID devices natively. What's the output of commands like: smartctl -a -d "aacraid,0,0,1" /dev/sda
