script question

Post by nadir » Fri Mar 01, 2013 12:25 am

I was asked whether I could help with a script. Since I was asked and don't need the help myself, I'm putting this in the General Nonsense section.
I gave up pretty fast, but then thought it was a good, or at least interesting, subject.

In Germany we have "Gelbe Seiten" (the yellow pages). There you can look up people who do a certain kind of work (dentist, gardener, well: anything).
They have a website:
http://www.gelbeseiten.de/
The search results from that site are what is needed, and what has to be extracted from them is:
name, address, phone, mail and www

I first thought this might be easy. But for me it isn't.
The two search boxes ask for:
< search term > (say gardener)
< city > (say Berlin)

The first problem I have is that I don't know how to tell wget (or whatever one uses) to "enter" the search term.
Since I couldn't figure that out, I just picked a search term ("Tischler", which is a carpenter) and a city (Arnsdorf), took the URL:
http://www.gelbeseiten.de/tischler/arnsdorf
and simply fetched it with wget.
Then I was lost again when trying to find a pattern for name, address, phone, etc., and gave up.
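
The URL above suggests that nothing has to be "entered" at all: term and city seem to be plain path segments. A minimal sketch under that assumption follows. The function names build_url and fetch_results are made up for illustration, and lowercasing the input is a guess based on the one URL shown. Terms with spaces or umlauts would probably need URL encoding, which this does not do.

```shell
#!/bin/sh
# Hypothetical helper: build a results URL from a search term and a city,
# ASSUMING the site keeps the /<term>/<city> pattern seen in the one URL above.
build_url() {
    term=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    city=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    printf 'http://www.gelbeseiten.de/%s/%s\n' "$term" "$city"
}

# Hypothetical helper: save the results page as <city>.html using wget.
fetch_results() {
    wget -q -O "$2.html" "$(build_url "$1" "$2")"
}

# Usage (not run here, it needs network access):
#   fetch_results Tischler Arnsdorf   # would write Arnsdorf.html
```

Only build_url can be checked offline; whether the site answers every term/city pair this way is untested.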

Any ideas or experiences or ever-run-into-something-which-sounded-like-that?

Re: script question

Post by fsmithred » Fri Mar 01, 2013 3:26 pm

OK, I'll start:
Code:
# get the names
sed -n '/itemprop="name"/{n;p;}' arnsdorf.html | sed 's/<\/span>//g'

# get the street addresses
grep street-address arnsdorf.html | sed -e 's/<span class="subscriber_street_address" itemprop="street-address">//g' -e 's/<\/span>,//g'
I'll leave it to you to sort the output so that the names match the street addresses.
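
One way to line the two outputs up, as a rough sketch: write each list to its own file and join them line by line with paste. The sample HTML below is made up to imitate the markup the two commands above expect (name on the line after itemprop="name", one street-address span per entry); the real page may well differ.

```shell
#!/bin/sh
# Made-up sample imitating the assumed markup; not copied from the real site.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<span itemprop="name">
Tischlerei Beispiel GmbH</span>
<span class="subscriber_street_address" itemprop="street-address">Hauptstr. 1</span>,
<span itemprop="name">
Holzbau Muster</span>
<span class="subscriber_street_address" itemprop="street-address">Dorfweg 22</span>,
EOF

names=$(mktemp)
streets=$(mktemp)

# Same extraction as above, one output file per field.
sed -n '/itemprop="name"/{n;p;}' "$sample" | sed 's/<\/span>//g' > "$names"
grep street-address "$sample" \
    | sed -e 's/<span class="subscriber_street_address" itemprop="street-address">//g' \
          -e 's/<\/span>,//g' > "$streets"

# paste joins line 1 with line 1, line 2 with line 2, and so on.
paired=$(paste -d ';' "$names" "$streets")
printf '%s\n' "$paired"

rm -f "$sample" "$names" "$streets"
```

This only works if every entry has exactly one name and one street address. As soon as one entry lacks an address, everything after it is shifted by one line.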

And now, I'll give up. My gut feeling is that there's a way to do it with JavaScript, since it's already laid out that way. But I really have no idea what I'm talking about. Are they pulling the data from a database, or do they have to manually add each entry to the page? If it comes from a database, you'd probably do better to access it directly. And I'll guess ahead of time that you can't do that.

Re: script question

Post by nadir » Sat Mar 02, 2013 3:04 pm

What does {n;p;} mean?
I can see what it does, but I would not be able to change it if I ever need it for something different.


A bit naively, I tried this:

for i in $(
sed -n '/itemprop="name"/{n;p;}' arnsdorf | sed 's/<\/span>//g'
)
do
# get the street addresses
grep street-address arnsdorf | sed 's/<span\ class=\"subscriber_street_address\"\ itemprop=\"street-address\">//g' |sed 's/<\/span>,//g'
done

-
Not much better, but perhaps on the right track:

for i in $(
sed -n '/itemprop="name"/{n;p;}' arnsdorf | sed 's/<\/span>//g'
)
do
echo $i
# grep street-address arnsdorf | sed 's/<span\ class=\"subscriber_street_address\"\ itemprop=\"street-address\">//g' |sed 's/<\/span>,//g'
echo
done
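
A likely reason the for loop misbehaves: for i in $( ... ) splits the output on every blank, so a name made of three words becomes three loop rounds. A small sketch of the difference, with made-up names standing in for the sed output, and assuming sh or bash:

```shell
#!/bin/sh
# Hypothetical names standing in for the output of the sed command above.
names='Tischlerei Beispiel GmbH
Holzbau Muster'

# for ... in $names splits on spaces AND newlines: one round per word.
count_for=0
for i in $names; do
    count_for=$((count_for + 1))
done

# while read takes one whole line per round, so each name stays intact.
count_while=0
while IFS= read -r name; do
    count_while=$((count_while + 1))
    printf 'name: %s\n' "$name"
done <<EOF
$names
EOF

printf 'for rounds: %s, while rounds: %s\n' "$count_for" "$count_while"
```

With these two names the for loop runs five times and the while loop twice. In a real script the here-document could be replaced by piping the sed output into the while loop.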

Re: script question

Post by fsmithred » Sun Mar 03, 2013 1:32 am

I'm not really sure what the letters mean. I think the p is print, and the n might mean next. You can also do {n;n;p;} and so on, to get a line further down after the line with the pattern.
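
For reference, as far as the sed documentation goes: n replaces the current line with the next input line (with -n nothing is printed on the way), and p prints whatever line is current at that point. A small demo on made-up input, where demo_input is just a helper invented for this example:

```shell
#!/bin/sh
# Made-up input: two "match" lines, each followed by two other lines.
demo_input() {
    printf '%s\n' 'match A' 'line 1' 'line 2' 'match B' 'line 3' 'line 4'
}

# On a matching line: n loads the next line, p prints it.
one_after=$(demo_input | sed -n '/match/{n;p;}')

# Two n commands skip one line further before printing.
two_after=$(demo_input | sed -n '/match/{n;n;p;}')

printf 'one after:\n%s\n' "$one_after"   # line 1, line 3
printf 'two after:\n%s\n' "$two_after"   # line 2, line 4
```

So each extra n moves the printed line one further down; it does not print more lines. To print both following lines, something like {n;p;n;p;} would be needed.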

Re: script question

Post by nadir » Sun Mar 03, 2013 4:38 am

Thanks. Really.

The bad news: sed will have to wait for me. I looked at the link you gave in the sed thread, and I might as well have watched the stars for half an hour.

