Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

openPyXL - assign value to range of cells during unmerge

So I have excel files with several sheets in each and I'm working on script which will gather data from selected sheets if they exist in file and combine it in one, big sheet. Generally it's working, iterating through files and if desired sheet exist it finds range of cells with data and append it to dataframe. The thing I need to do now is to add header row (column names) to Dataframe, but in sheet those are multiline headers.

To make it look the same in dataframe i need to unmerge cells in top header row and copy value from first cell to rest of them in range which was merged before).

I'm using OpenPyXL for accessing excel sheets. My function receives sheet to work on as only parameter. It looks like this:

def checkForMergedCells(sheet):
    merged = ws.merged_cell_ranges
    for mergedCell in merged:
        mc_start, mc_stop = str(mergedCell).split(':')
        cp_value = sheet[mc_start]
        sheet.unmerge_cells(mergedCell)
        cell_range = sheet[mergedCell]
        for cell in cell_range:
            cell.value = cp_value

The thing is that cell_range returns a tuple which ends up in getting error message:

AttributeError: 'tuple' object has no attribute 'value' Below you can see screencap during debug which shows values passed in each variable.

Debugger running

like image 584
Uzzy Avatar asked Nov 01 '25 02:11

Uzzy


2 Answers

Accessing by index will generally return a tuple of tuples except if you try to get an individual cell or row. For programmatic access you should use iter_rows() or iter_cols()

You might like to spend some time looking at the utils module.

from openpyxl.utils import range_boundaries

for group in ws.merged_cell_ranges:
     min_col, min_row, max_col, max_row = range_boundaries(group)
     top_left_cell_value = ws.cell(row=min_row, column=min_col).value
     for row in ws.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
         for cell in row:
             cell.value = top_left_cell_value
like image 177
Charlie Clark Avatar answered Nov 02 '25 17:11

Charlie Clark


Issues with other answers

Regarding @Charlie Clark's selected answer and the other answers which use the code from http://thequickblog.com/merge-unmerge-cells-openpyxl-in-python, you can unmerge the cells more easily without dealing with range_boundaries and those conversions.

I was also getting issues with the selected answer where some merged cells would unmerge but others would not, and some unmerged cells would fill with the data I wanted but others would not.

The issue is that worksheet.merged_cells.ranges is an iterator which means it is lazily evaluated, so when worksheet.unmerge_cells() is called, the object worksheet.merged_cells is mutated and side-effects occur when iterating the merged cell ranges again.

Better solution

In my case, I wanted to unmerge cells like this, copying the border, font, and alignment information as well:

                    +-------+------+
+-------+------+    | Date  | Time |
| Date  | Time |    +=======+======+
+=======+======+    | Aug 6 | 1:00 |
|       | 1:00 | -> +-------+------+
| Aug 6 | 3:00 |    | Aug 6 | 3:00 |
|       | 6:00 |    +-------+------+
+-------+------+    | Aug 6 | 6:00 |
                    +-------+------+

For the current latest version of openpyxl==3.0.9, I found that the following works best for me:

from copy import copy

from openpyxl import load_workbook, Workbook
from openpyxl.cell import Cell
from openpyxl.worksheet.cell_range import CellRange
from openpyxl.worksheet.worksheet import Worksheet


def unmerge_and_fill_cells(worksheet: Worksheet) -> None:
    """
    Unmerges all merged cells in the given ``worksheet`` and copies the content
    and styling of the original cell to the newly unmerged cells.

    :param worksheet: The Excel worksheet containing the merged cells.
    """

    # Must convert iterator to list to eagerly evaluate all merged cell ranges
    # before looping over them - this prevents unintended side-effects of
    # certain cell ranges from being skipped since `worksheet.unmerge_cells()`
    # is destructive.
    all_merged_cell_ranges: list[CellRange] = list(
        worksheet.merged_cells.ranges
    )

    for merged_cell_range in all_merged_cell_ranges:
        merged_cell: Cell = merged_cell_range.start_cell
        worksheet.unmerge_cells(range_string=merged_cell_range.coord)

        # Don't need to convert iterator to list here since `merged_cell_range`
        # is cached
        for row_index, col_index in merged_cell_range.cells:
            cell: Cell = worksheet.cell(row=row_index, column=col_index)
            cell.value = merged_cell.value

            # (Optional) If you want to also copy the original cell styling to
            # the newly unmerged cells, you must use shallow `copy()` since
            # cell style properties are proxy objects which are not hashable.
            #
            # See <https://openpyxl.rtfd.io/en/stable/styles.html#copying-styles>
            cell.alignment = copy(merged_cell.alignment)
            cell.border = copy(merged_cell.border)
            cell.font = copy(merged_cell.font)


# Sample usage
if __name__ == "__main__":
    workbook: Workbook = load_workbook(
        filename="workbook_with_merged_cells.xlsx"
    )
    worksheet: Worksheet = workbook["My Sheet"]

    unmerge_and_fill_cells(worksheet=worksheet)
    workbook.save(filename="workbook_with_unmerged_cells.xlsx")

Concise solution

Here is a shorter version without comments and not copying styles:

from openpyxl.worksheet.worksheet import Worksheet

def unmerge_and_fill_cells(worksheet: Worksheet) -> None:
    for merged_cell_range in list(worksheet.merged_cells.ranges):
        worksheet.unmerge_cells(range_string=merged_cell_range.start_cell)

        for row_col_indices in merged_cell_range.cells:
            worksheet.cell(*row_col_indices).value = merged_cell.value
like image 30
homersimpson Avatar answered Nov 02 '25 15:11

homersimpson