Hi, I am following this guide to convert Excel to CSV and then ingest it into the Data Lake: https://developer.infor.com/tutorials/data-management/ingest-excel-files-into-the-data-lake
I have downloaded and imported the openpyxl library:
files.pythonhosted.org/.../openpyxl-3.1.5-py2.py3-none-any.whl
and
pandas
https://files.pythonhosted.org/packages/d0/e1/64f9c1fccd5eebdf177e917e5499b6da266c409b6eba75b93a8cd3b8ccee/pandas-1.1.5-cp39-cp39-manylinux1_x86_64.whl
But when I test in ION Script, I get an error (with no detailed error message). Can you advise, please?
Have you tried pandas 2.2.2 instead of 1.1.5? Also, when I did this project, I had to add the dependency libraries numpy and pytz. The file that came into the script was an XML, so I also had et_xmlfile. It should give an error detailing the missing dependencies. Did you test from within the script itself?
https://files.pythonhosted.org/packages/bb/30/f6f1f1ac36250f50c421b1b6af08c35e5a8b5a84385ef928625336b93e6f/pandas-2.2.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Thanks Brandon,
I have imported your pandas 2.2.2 successfully. Now, when I test within the script, I get this error:
scripting.Traceback (most recent call last):
  File "script", line 1, in <module>
  File "/var/task/pandas/__init__.py", line 19, in <module>
    raise ImportError(
ImportError: Unable to import required dependencies:
numpy: No module named 'numpy'
pytz: No module named 'pytz'
Could you share the download sources for the other libraries, please? (numpy, openpyxl, pytz, ...)
Below are the pytz, numpy, and openpyxl files that should work. You sometimes have to go to the version history to download older versions that support Python 3.9.
https://files.pythonhosted.org/packages/9c/3d/a121f284241f08268b21359bd425f7d4825cffc5ac5cd0e1b3d82ffd2b10/pytz-2024.1-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/b1/e3/24d289c5a3255bf52824bd52295e9a7923cad8ae5ec29539fc971e1122f6/numpy-2.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
files.pythonhosted.org/.../openpyxl-3.1.5-py2.py3-none-any.whl
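For picking the right file from the version history, the wheel file name itself tells you what it supports. Here is a rough sketch that splits a file name into its tags and checks it against Python 3.9 on Linux. The function names are made up for illustration, the "cp39" and Linux targets are assumptions based on the paths and file names in this thread, and it does not check the ABI tag, so treat it as a quick filter rather than a full compatibility check.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # name-version-python-abi-platform, plus an optional build tag after version
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel file name: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag.split("."),
        "abi": abi_tag.split("."),
        "platform": platform_tag.split("."),
    }


def looks_compatible(filename, python_tag="cp39"):
    """Rough check: Python tag matches and the platform is 'any' or Linux."""
    info = parse_wheel_filename(filename)
    generic_tag = "py" + python_tag[2]  # "cp39" -> "py3" (pure-Python wheels)
    python_ok = python_tag in info["python"] or generic_tag in info["python"]
    # Assumption: the script runtime is Linux, as the /var/task paths suggest.
    platform_ok = any(p == "any" or "linux" in p for p in info["platform"])
    return python_ok and platform_ok


for wheel in [
    "pytz-2024.1-py2.py3-none-any.whl",
    "numpy-2.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
]:
    print(wheel, looks_compatible(wheel))
```

A file tagged for another interpreter (for example cp312) or another platform (for example win_amd64) would come back as False with this check.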
Thank you very much for sharing. I have added them and tested, but I still get an error. Do you have any idea what is wrong?
scripting.Traceback (most recent call last):
  File "/var/lang/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/var/task/openpyxl/__init__.py", line 7, in <module>
    from openpyxl.workbook import Workbook
  File "/var/task/openpyxl/workbook/__init__.py", line 4, in <module>
    from .workbook import Workbook
  File "/var/task/openpyxl/workbook/workbook.py", line 7, in <module>
    from openpyxl.worksheet.worksheet import Worksheet
  File "/var/task/openpyxl/worksheet/worksheet.py", line 24, in <module>
    from openpyxl.cell import Cell, MergedCell
  File "/var/task/openpyxl/cell/__init__.py", line 3, in <module>
    from .cell import Cell, WriteOnlyCell, MergedCell
  File "/var/task/openpyxl/cell/cell.py", line 26, in <module>
    from openpyxl.styles import numbers, is_date_format
  File "/var/task/openpyxl/styles/__init__.py", line 4, in <module>
    from .alignment import Alignment
  File "/var/task/openpyxl/styles/alignment.py", line 5, in <module>
    from openpyxl.descriptors import Bool, MinMax, Min, Alias, NoneSet
  File "/var/task/openpyxl/descriptors/__init__.py", line 4, in <module>
    from .sequence import Sequence
  File "/var/task/openpyxl/descriptors/sequence.py", line 4, in <module>
    from openpyxl.xml.functions import Element
  File "/var/task/openpyxl/xml/functions.py", line 36, in <module>
    from et_xmlfile import xmlfile
ModuleNotFoundError: No module named 'et_xmlfile'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/task/lambda.py", line 172, in _execute
    exec(cc, namespace, namespace)
  File "script", line 2, in <module>
  File "/var/task/pandas/io/excel/_base.py", line 495, in read_excel
    io = ExcelFile(
  File "/var/task/pandas/io/excel/_base.py", line 1567, in __init__
    self._reader = self._engines[engine](
  File "/var/task/pandas/io/excel/_openpyxl.py", line 552, in __init__
    import_optional_dependency("openpyxl")
  File "/var/task/pandas/compat/_optional.py", line 138, in import_optional_dependency
    raise ImportError(msg)
ImportError: Missing optional dependency 'openpyxl'. Use pip or conda to install openpyxl.
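Reading the traceback from the top, the root cause is the first exception: openpyxl itself is present, but its import fails with "No module named 'et_xmlfile'", and pandas then reports that as a missing openpyxl. So et_xmlfile (mentioned earlier in the thread) still needs to be added as a library. To avoid finding the gaps one traceback at a time, a small check like the one below could be run as the script first. It is only a sketch: the module list is an assumption, and dateutil and six are additions not mentioned in this thread (pandas lists python-dateutil as a required dependency, and dateutil imports six).

```python
import importlib.util

# Assumed list of modules needed for pandas.read_excel with the openpyxl engine.
# "dateutil" and "six" are not from this thread; adjust the list as needed.
REQUIRED_MODULES = [
    "numpy",
    "pytz",
    "dateutil",
    "six",
    "et_xmlfile",
    "openpyxl",
    "pandas",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    missing = []
    for name in modules:
        # find_spec only locates a top-level module; it does not import it.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


missing = find_missing(REQUIRED_MODULES)
print("missing modules:", missing)
```

Here the result is just printed; in your script you may prefer to assign the list to an output variable so it shows up in the test result. An empty list means every module in the list was found.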