wind77
20 January 2025 00:06
1
Hi all.
I’m cracking my head, trying to extract info using Google Sheets.
The situation:
In the cell, the data is
<media ref="/ABC/XYZ123/QWE123/ASD/AAA BBB.123.CCC-DDD.mkv">
ABC
, XYZ
, QWE
, ASD
, AAA
, BBB
, CCC
, DDD
are random words.
123
are random numbers as well.
I want to retrieve AAA BBB.123.CCC-DDD.mkv
in another cell.
I tried to use regextract
, but other than the /
and .mkv
, I cannot find a constant to leverage on.
Any suggestion?
cscs
20 January 2025 00:22
2
I’m not sure how you mean … as in are we working with the sheet itself or just the data?
$ printf '<media ref="/ABC/XYZ123/QWE123/ASD/AAA BBB.123.CCC-DDD.mkv">' | sed -e 's:.*/::' -e 's/..$//'
AAA BBB.123.CCC-DDD.mkv
wind77
20 January 2025 00:29
3
Hmm…
Say, cell A1 in Google Sheet, the data is:
<media ref="/ABC/XYZ123/QWE123/ASD/AAA BBB.123.CCC-DDD.mkv">
Now, I want cell A2 in Google Sheet, to “extract ” the content from A1.
The needed content is:
AAA BBB.123.CCC-DDD.mkv
The problem is:
I tried to use regextract
formula, but doesn’t seem like a fit here.
So, I’m hoping someone would have an idea how to do it.
search for the split() function in the documentation
=regexextract(A1,"[^/]+ \S+\.mkv")
wind77
20 January 2025 15:31
6
After testing it,
This works for:
<media ref="/ABC/XYZ123/QWE123/ASD/AAA BBB.123.CCC-DDD.mkv">
But for:
<media ref="/ABC/XYZ123/QWE123/ASD/AAA.BBB.123.CCC-DDD.mkv">
It doesn’t work due to no white space present.
So the whole line was extracted in A2:
<media ref="/ABC/XYZ123/QWE123/ASD/AAA.BBB.123.CCC-DDD.mkv">
Additional thought:
instead of using .mkv
as end-point, how should I use ">
as end-point instead?
This way, I can include other type of file extensions.
I have no expertise with either regular expressions or Google Sheets, so I am sure that someone else can come up with a more elegant solution, but this seems to work:
=regexextract(A1,"/([^/]+)"">")
Note that your question is probably better suited for a forum specific to Google Sheets or spreadsheets in general.
3 Likes
wind77
20 January 2025 17:44
8
Beautiful!
Elegant approach!
1 Like
regex should be way easier than learning a natural foreign language,
but learning and grasping regular expressions is still very, very hard (for me)
… too much rigorous logic for me
I’ll keep trying to understand that one.
wind77
20 January 2025 18:17
10
His approach is to include everything between /
and ">
.
But the use of ^/
means the content must not contain /
, which effectively pinpoint the string from the last /
to ">
, instead of any /
in the string.
Simple and elegant.
Thank you for elaborating on what apparently works!
I’ll keep trying to understand.
It’s much easier for me when having some actual examples rather than some abstractions.
No worries - I was just trying to follow.
I’ll get it eventually.
1 Like
system
Closed
23 January 2025 18:42
12
This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.