How do I automate Smartsheet API "get_sheet" function to go through a list of sheets and save data?

J.Greene
J.Greene ✭✭
edited 11/09/22 in API & Developers

Hi,


I'm currently creating a catalogue of all sheets in a workspace using the "get_sheet" function.


The data I'm collecting (crossSheetReferences, cellLinkIn, LinksOut) can only be found within the "get_sheet" function.


I have a list of sheet ID's (sheet_list) that I use for collecting data.


My code (python) looks like this:

smart = smartsheet.Smartsheet(key)

response = smart.Sheets.get_sheet(sheet_list[1], include = "crossSheetReferences")

#1 being position of sheet ID in sheet_list.



If I try and do a for loop like so:

for i in sheet_list:

response = smart.Sheets.get_sheet(i, include = "crossSheetReferences")


It takes 30mins to finish execution, but it only saves on a single smartsheet object. I can check and see that I viewed every page via API on the smartsheet activity log, but it doesn't seem to save properly to "response". I only get cell data from one sheet.


Any help with this or a workaround would be appreciated. I could go through get_sheet individually for every sheet, but my catalogue contains thousands of sheets across several workspaces, so it would be fairly tedious. Thanks

Best Answer

  • bsikes
    bsikes ✭✭✭✭
    Answer ✓

    I'm not familiar with Python, but if that's your code I would guess that your saving over the response variable each time you go through your loop. Meaning the last sheet that you get_sheet on, is the only one in the response variable.

    If that's the case, I think you would need to add the response variable to some kind of a collection/array after each call.

Answers

  • bsikes
    bsikes ✭✭✭✭
    Answer ✓

    I'm not familiar with Python, but if that's your code I would guess that your saving over the response variable each time you go through your loop. Meaning the last sheet that you get_sheet on, is the only one in the response variable.

    If that's the case, I think you would need to add the response variable to some kind of a collection/array after each call.

  • Thanks for the input. I figured it out with the following:

    responsefull = []

    for i in sheet_list:

    response = smart.Sheets.get_sheet(i, include = "crossSheetReferences")

    responsefull.append(response)


    Now, I have individal smartsheet objects saving correctly instead of into one, such as:

    <smartsheet.models.sheet.Sheet at 0AaAaAaAa0>

    However, I'm trying to parse this as json and save to a pandas dataframe for analysis.

    My code for that is as follows:

    strresponsefull = [ ]

    for i in responsefull:

    strresponse = str(i)

    strresponsefull .append(strresponse)

    strresponsefull = str(strresponsefull)

    strresponsefull = strresponsefull.lstrip("[\'")

    strresponsefull = strresponsefull.rstrip("\']")

    strresponsefull = json.dumps(strresponsefull ,skipkeys = False)


    y = json.loads(strresponsefull)

    My issue is: this last function doesn't load in JSON format correctly. I think it's because I use [ ] to store as a list, then try and reformat it with lstrip. When I previously did this with just one sheet, I didn't need to store it in a list for it to be pulled into a dataframe. In this case, y is a str, but i need it as a dict. converting y to dict at this stage doesn't work as it returns a blank df with the following code:

    df = pd.DataFrame.from_dict(pd.json_normalize(y), orient='columns')


    The goal is to use this along with finding tuples for "crossSheetReferences" as well as "linkInFromCell" and "linksOutToCells" with this:

    data = y


    tree_obj = objectpath.Tree(data)


    crossSheets = tuple(tree_obj.execute('$..crossSheetReferences'))


    The above code works, but only if y is one sheet, not several. y in this case needs to be a dictionary. I will paste below the full working code for a single sheet:

    response = smart.Sheets.get_sheet(sheet_list[1], include = "crossSheetReferences")

    strresponse = str(response)

    y = json.loads(strresponse)

    data = y


    tree_obj = objectpath.Tree(data)


    crossSheets = tuple(tree_obj.execute('$..crossSheetReferences'))

    #prints out all crossReferences for this sheet

    df = pd.DataFrame.from_dict(pd.json_normalize(y), orient='columns')

    #then add crossSheets to df

    Any help would be appreciated. Thanks