Wikivoyage talk:Copyright-related issues

Latest comment: 5 months ago by Yvwv in topic Walking tour copyright?

This is spam, please revert it

A spam bot likes to spam the first section of this article, so this section has been added as a way of notifying editors that a spam contribution was made. -- (WT-en) Ryan 16:49, 17 April 2006 (EDT)

International copyright issue

I'm translating the Copyleft page into German. For me, a big legal question has arisen: which law is applicable when a German contributor transmits his work onto a US server, German or US law? This is not unimportant, since there also exist German and Austrian adaptations of the Attribution-ShareAlike 1.0 Licence. To which one should I link now??? -- (WT-en) Hansm 09:05, 2004 Sep 13 (EDT)

Swept in from the Project:Travelers' pub:

There is a lot of material from As there are also a lot of references to edirectory (against ext guide policy), the anon user just might be the copyright holder, but I doubt it. He also renames Learn to Money, etc. --(WT-en) elgaard 10:19, 4 Jul 2005 (EDT)

That's a clear-cut case for reversion. He has to actually say he's the copyright holder in order to do this. It could always be recovered if he says he's the copyright holder. -- (WT-en) Colin 13:20, 4 Jul 2005 (EDT)

See also

In case someone thinks removing the following relevant links is helpful, please pause to think. Research into what is actually legal for us to copy is necessary. The following are valuable research leads and references. Please do not remove them from this Talk page.

  • Wikivoyage uses a Creative Commons licence while Wikipedia uses a GNU Free Documentation License. Nevertheless, much relevant and well-presented research about copyrights may be found on Wikipedia's copyright and copyright discussion pages:
  • Universal Copyright Convention as revised at Paris on 24 July 1971:
  • Per section
  • One whole (searchable) page


Swept in from the pub

The upload process for Wikivoyage has licensing options for Creative Commons 3.0, etc. but not 4.0. This should be added so giving appropriate credit is possible. Thanks. --Comment by Selfie City (talk | contributions) 01:09, 15 February 2019 (UTC)

I suppose the list just hasn't been updated, so I am adding the licence now. Could somebody who does local uploads check the change works as it should?
I do not understand the "so giving appropriate credit is possible" part. If you upload images taken by others, then there are a zillion possible licences. Is there something in the CC-BY-SA 3.0 that hinders giving "appropriate credit"?
If there are technical issues, the talk page is MediaWiki talk:Licenses.
--LPfi (talk) 07:30, 15 February 2019 (UTC)
I left the old version. I suppose the selection of licences is small by design, so having two versions of the same licence could be regarded as redundant. On the other hand, CC-BY-SA 3.0 is the licence used for text on WV, WP etc., and I for one have not studied the changes introduced in the new version and am thus not confident enough to use it for general licensing. --LPfi (talk) 07:35, 15 February 2019 (UTC)
Thanks for adding CC 4.0. The selection now implies that CC 2.0 is only for Flickr images, when it could be used for those based on older Commons files (maybe it has always done this). It is missing a public domain licence, although if an image is public domain, then you are free to upload it using one of the CC licences. I don't think that we have much use for the more specialised licenses that Commons allows. It would be good to have a page which explains licenses linked from the upload page. AlasdairW (talk) 14:25, 15 February 2019 (UTC)
I just added the 4.0 ones at "Creative Commons License". I supposed the Flickr licences were Flickr-specific. Having the same licences twice (once for licence source, once for image source) makes little sense, so I merged the lists.
Licensing a PD image with a CC licence is copyright fraud. While it seems not to be criminal in the USA, I think we should not do something that seems like encouraging the practice. The question is whether {{PD}} is enough or whether we should offer several PD tags.
The question about "specialised licenses" is whether we want to support upload of images which are under some other licence. As long as the uploader is the copyright owner, it is easy to require the use of one of a small set of standard licences, but if we want to use a photo that is licensed under any other licence, we have to either do without or accept that licence, however "specialised" it is. I think "Something else" has to be offered for those cases. I will add that, and corresponding language to MediaWiki:Uploadtext.
--LPfi (talk) 14:54, 15 February 2019 (UTC)
The reason is that, per "appropriate credit", the box with the information about the license is necessary. --Comment by Selfie City (talk | contributions) 15:24, 15 February 2019 (UTC)
Mentioning the licence is an additional requirement to that of giving appropriate credit, a requirement of many more licences than those in the drop-down. The other licences have to be handled by inserting the template by hand, or writing name and link by hand for those licences lacking a template. But as CC-BY-SA 4.0 has become the recommended licence on Commons, it is good that we offer an easy way to use it. --LPfi (talk) 16:46, 15 February 2019 (UTC)
OK. I'll try the "Upload file" now to make sure it works, but the code that I saw looked good. --Comment by Selfie City (talk | contributions) 23:39, 15 February 2019 (UTC)
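As background for the thread above: the drop-down on the local upload form is generated from the MediaWiki:Licenses page, where each entry maps a licence template to the label shown in the list. The following is only a sketch of what an updated list might look like; the template names used here (self, cc-by-sa-4.0, cc-by-sa-3.0, PD) are assumptions and must match templates that actually exist on the local wiki.

```
* Free licences:
** self|cc-by-sa-4.0|Own work, Creative Commons Attribution-ShareAlike 4.0
** cc-by-sa-4.0|Creative Commons Attribution-ShareAlike 4.0
** cc-by-sa-3.0|Creative Commons Attribution-ShareAlike 3.0
** PD|Public domain
```

A "Something else" choice, as discussed above, would need its own entry pointing uploaders to add the correct licence template to the file page by hand.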


Is it okay to use pictures using the GNU license for banners? I've never really paid attention to it before, because beneath it in Commons are always Creative Commons licenses, but just checking. --Comment by Selfie City (talk | contributions) 01:16, 16 February 2019 (UTC)

As I understand it the GFDL is inconvenient for reusers and therefore generally not allowed as the sole license for new uploads to Commons (see here). This doesn't really matter for us directly. The only reason for us to avoid the license, as far as I can tell, would be to make it easier for people to meticulously follow the licensing requirements when reusing our content in offline/printed materials. Files that are dual-licensed under GFDL and CC are absolutely fine, we (or other reusers) can just use the CC license and ignore the GFDL. —Granger (talk · contribs) 02:14, 16 February 2019 (UTC)
I see. In my cases, I believe the files had both GNU and Commons licenses listed, so when uploading the cropped version of the file, I cited the Commons license. Thanks for explaining! --Comment by Selfie City (talk | contributions) 02:40, 16 February 2019 (UTC)
Oh, and thanks for making all those changes to "North Macedonia". --Comment by Selfie City (talk | contributions) 02:40, 16 February 2019 (UTC)

Articles vs GPT

Swept in from the pub

I'm sure most of you guys are aware of GPT by now. GPT-4 (also the one on can now summarize articles, and even do stuff like ("Find hidden gem travel spots in Maui, Hawaii") ... Isn't it time to rethink our approach to putting together region articles? Obviously it's a chicken-and-egg problem, but still I'd say in the coming months/years it will progressively be less and less valuable to spend time on stuff that can be generated (esp. since GPT will have some statistical knowledge about the regions, regarding the most cited POIs). Is there some "will" here to discuss major changes, or will we do it the "good old way" until WV stops being relevant? I'd say city-level articles are mostly safe for now, but the higher-level ones.......... -- andree 17:50, 7 May 2023 (UTC)

Chat GPT has built-in problems with plagiarism, and also, when it's inaccurate, if we use it, who is at fault? We are not going to outsource anything to them. Ikan Kekek (talk) 17:54, 7 May 2023 (UTC)
Obviously, changing WV to a GPT proxy wasn't the suggestion... Sure, in principle all the AI stuff is either hallucination or plagiarism, by definition. That doesn't change the fact that the way we access/search for (any) information will likely change dramatically. "Morality" issues aside (imho, outcome like "napster-spotify" will come out of this, after some time of trying to fight this by regulations), I'd say we should brace ourselves and consider the options.
I'm but a minor WV editor, but I'm always quite surprised how many "most interesting" places are missing compared to various IG/FB travel tips channels. So that's what I mostly added in the past years. But finding the right place/city is often hard. And finding those "gems" via breadcrumbs navigation is downright impossible. I'd like to hear some suggestions on how to deal with that sort of problem. How to make WV actually useful on its own. IMO the region articles are in a very poor state globally (sans the country and maybe 1-2 levels below). Is the plan that WV will be only a source of the articles forever, and bing/gpt/google/... will do the aggregation? -- andree 19:21, 7 May 2023 (UTC)
The obvious solution is that the most interesting places in each region need to be mentioned in the region articles. We should do a Collaboration of the month on making sure the most interesting places are mentioned in region articles, with a link to the local articles that have fuller listings of them (meaning that adding some listings in local articles will be part of the collaboration, too). And since I just finished grading for the semester, I'd be happy to take part in it. Let's decide on a good scope of work and phrasing for the Cotm, seek participants and get started with it. Lots of region articles have sucked forever. Ikan Kekek (talk) 19:49, 7 May 2023 (UTC)
Proposed at Wikivoyage talk:Collaboration of the month#New Cotm to mention places of interest in region articles?. Ikan Kekek (talk) 21:38, 7 May 2023 (UTC)
Is it possible to use ChatGPT or other AI to produce region summaries, using only the referenced city articles (and sub-regions) as sources? This could then be compared with what we have in the region article. AlasdairW (talk) 22:59, 7 May 2023 (UTC)
You can try and find out, but I don't see why it's necessary. Ikan Kekek (talk) 23:21, 7 May 2023 (UTC)
The use of AI for article creation is going to be one of the sub-themes for WikiConference North America submissions. OhanaUnitedTalk page 03:54, 8 May 2023 (UTC)
Something like this would be my idea too. But just purely with our articles, it's not possible to figure out (neither for a human, nor AI) which POIs are the most interesting/popular/visited. Something like "1) find all POIs in the articles of the region, 2) score the POIs (e.g. by google hits, or via GPT "what are the most interesting out of these") 3) progressively fill the parent region structures with the POIs acc. to score" would be ideal. GPT could help here to summarize/reword/shorten the POIs, saving us time (at the cost of using external sources of undisclosed origin and license). I think doing all this by hand is not sustainable, unless we want to emulate monks rewriting books... But I don't have the answer how to do this, I just suggest we start some discussion on the main WV goals (I reckon WP is already aware of the paradigm shift) - what the people around there love to do the most, and think how we could do the rest more easily/optimally. IMO the lowest-level articles is where it's at, and obviously some country/top-level-region summaries. But the mechanical maintenance of regions, not so much...
TBH, as an example of the progress, I also think the languages of WV may get quite obsolete. Asking GPT to 'take ":wv:it:Roma", transform the templates and translate to English, produce a valid ":wv:en:Rome" article' will IMHO be possible within months. Perhaps it will even be able to give sources of the copied-in texts. It's the kind of task that GPT seems to excel at, and that instead takes us a loooong time and has no added value. Do we want to use this kind of technology, or keep the old ways? A sincere question - maybe people here mostly like the current ways, which is okay... -- andree 07:21, 8 May 2023 (UTC)
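The three-step idea sketched above (collect POIs from the region's city articles, score them with some external signal, then surface the top ones in the region article) can be illustrated with a toy script. Everything here is invented for illustration: the city names, the POIs, and the static scores standing in for Google hits or an LLM ranking.

```python
# Toy illustration of the three steps described above. All names and
# scores are invented; a real pipeline would parse Wikivoyage articles
# and use an external popularity signal (search hits, an LLM ranking).

# Step 1: POIs collected from the city articles of a region (dummy data).
pois_by_city = {
    "City A": ["Old Town Hall", "River Promenade"],
    "City B": ["Salt Mine", "Folk Museum"],
}

# Step 2: a score per POI; static numbers stand in for "Google hits"
# or an LLM's "most interesting" ranking.
scores = {
    "Old Town Hall": 90,
    "River Promenade": 40,
    "Salt Mine": 95,
    "Folk Museum": 55,
}

def top_pois_for_region(pois_by_city, scores, limit=3):
    """Step 3: pick the highest-scoring POIs to surface in the region article."""
    all_pois = [(poi, city) for city, pois in pois_by_city.items() for poi in pois]
    ranked = sorted(all_pois, key=lambda pc: scores.get(pc[0], 0), reverse=True)
    return ranked[:limit]

print(top_pois_for_region(pois_by_city, scores))
# [('Salt Mine', 'City B'), ('Old Town Hall', 'City A'), ('Folk Museum', 'City B')]
```

The hard parts, of course, are steps 1 and 2: extracting listings reliably and finding a scoring signal that is not itself hallucinated.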
One of my recent concerns is the sheer capabilities of ChatGPT to provide accurate travel information within sections, which could possibly drive away readership (which isn't what this is about). As Ikan Kekek mentioned above, though, the best way to start is to not have empty region articles with absolutely nothing outside the cities/ODs list. SHB2000 (talk | contribs | meta) 06:33, 8 May 2023 (UTC)
Who owns Chat GPT generated content?
“Absent human creative input, a work is not entitled to copyright protection. As a result, the U.S. Copyright Office will not register a work that was created by an autonomous artificial intelligence tool.”
Should WV have a "No Chat GPT" policy? Should there be a Chat GPT template that says "This article contains content generated by Chat GPT. View the page revision history for a list of the authors."
FWIW I see Chat GPT like dynamic maps. Sure the static ones can be better, but the dynamic ones are much better than nothing. ButteBag (talk) 16:49, 8 May 2023 (UTC)
The English Wikipedia has been talking about this for months, and I believe that they have come to the conclusion that there is no way to reliably identify whether a bit of text comes from ChatGPT or similar software. The original editor might choose to self-disclose, but if they don't, we'll never know. WhatamIdoing (talk) 19:20, 8 May 2023 (UTC)
ah, ok, guess it's a non-issue then. Thanks! ButteBag (talk) 20:06, 8 May 2023 (UTC)
If we can't enforce a ban, that doesn't make it a non-issue, because of the copyright implications you mentioned above. I think we should have a policy of opposing edits by chatbots. I think we have to at least do due diligence by putting people on notice, to avoid potentially being legally responsible for copyright violation as a site. How has en.wikipedia been dealing with that question? Ikan Kekek (talk) 21:13, 8 May 2023 (UTC)
Agree we should oppose chatbot edits. Maybe you could elaborate on what we're putting people on notice for? My simple and limited understanding is that WV content contributors can use text generated by chat gpt in their edits. The page I linked above seems to indicate it's not a copyvio. I have not used Chat GPT in any of my edits so far fwiw, but I would consider it if I found it helpful. ButteBag (talk) 23:14, 8 May 2023 (UTC)
When you say "chatbot", do you mean a human manually typing something into a website, reading the output from that website, and then copying the results into a Wikivoyage page? Or do you mean actual Wikivoyage:Bots, which edit autonomously without a human looking over the results first? WhatamIdoing (talk) 15:24, 9 May 2023 (UTC)
Speaking for myself, "chatbot" means an autonomous bot adding chat gpt text to WV without human intervention. I think it's ok for a human to use chat gpt to generate text and paste it in, although obviously they should check for errors first and I'd much prefer a knowledgeable human editor instead. Maybe this nuance already exists on a policy page somewhere? ButteBag (talk) 19:29, 9 May 2023 (UTC)
How about this? "Because of the risks of copyright violation and inaccuracy from chatbots, Wikivoyage editors are not permitted to add text produced by a chatbot unless they have carefully checked it for copyright violation and inaccuracy and made any needed edits before inserting it into Wikivoyage articles. Furthermore, should their edits introduce copyright violations, users make such edits at their own sole risk in case of a copyright lawsuit by an aggrieved party and may be banned from editing if such violations are discovered." Ikan Kekek (talk) 20:33, 9 May 2023 (UTC)
Otherwise I agree, but how can you carefully check there are no copyright violations? The plagiarism programs used at universities? I would rather advise rewording the facts, as if they were fetched directly from a copyrighted text. –LPfi (talk) 20:43, 9 May 2023 (UTC)
Using GPTZero. SHB2000 (talk | contribs | meta) 21:31, 9 May 2023 (UTC)
So should we use this text? "Because of the risks of copyright violation and inaccuracy from chatbots, editors are not permitted to add text produced by a chatbot unless they reword it completely before inserting it into Wikivoyage articles - the same standard that applies to the use of information from sources that are 'copyright, all rights reserved'. Furthermore, if users' edits introduce copyright violations, they make such edits at their own sole risk in case of a copyright lawsuit by an aggrieved party and may be banned from editing if such violations are discovered." Ikan Kekek (talk) 21:41, 9 May 2023 (UTC)
I'd steal much of the 3rd paragraph from Wikipedia:Large language models. The term LLM also seems more precise to me than chatbot. It's tough to enforce a 100% complete rewording always. Maybe the output is different enough in some cases, who knows. I tried the gptzero link above and it misidentified some of the generated text I pasted into it.
"LLM-generated content is often an outright fabrication or biased. LLMs must not be used in areas where the editor does not have substantial familiarity. Output text must be rigorously scrutinized for factual errors and adherence to copyright policies. Editors are fully responsible for their LLM-assisted edits. If you are not fully aware of the risks, do not edit with the assistance of these tools. Furthermore, if users' edits introduce copyright violations, they make such edits at their own sole risk in case of a copyright lawsuit by an aggrieved party and may be banned from editing if such violations are discovered." ButteBag (talk) 00:48, 10 May 2023 (UTC)
Just a quick note: Bias is a Wikipedia issue, not a Wikivoyage issue, as long as it's fair. Ikan Kekek (talk) 01:06, 10 May 2023 (UTC)
If the user needs to thoroughly verify the accuracy of everything (which is important because GPT's output is so full of inaccuracies) and rewrite everything (to avoid hard-to-check-for copyright violations), then what's the benefit of getting the material from the LLM? It would be simpler not to allow text from GPT at all. That would also be safer – I worry that if editors start copying GPT output into Wikivoyage, they're unlikely to check the accuracy as thoroughly as necessary, and in my experience ChatGPT has an uncanny knack for writing things that sound true but are completely made up.
The best use case I can think of would be to use GPT to generate a bullet-point list of attractions, which the editor could then write original descriptions about. But in this case both the information and the wording would come from the editor; GPT would only be helping to brainstorm, so to speak. —Granger (talk · contribs) 03:13, 10 May 2023 (UTC)
Editors may be tempted to use such text, so highlighting the problems is better than just blankly forbidding their use (which may be seen as pure prejudice). I agree that brainstorming help is the best approach. Usually official links to the POIs found should be included in listings, and those links can be used to check facts. The AI's role would be to list potentially interesting places and provide links or search terms for finding them. LPfi (talk) 08:06, 10 May 2023 (UTC)
But this isn't exactly right. Copyright-wise, the stuff GPT creates is in principle no different from you learning about a city from a few travel books, and writing it in your words. It doesn't usually copy-paste blocks of text (unless you ask it to). Rewording such non-directly-copied text gives you the same thing, just with an extra step. Not to mention, it's almost impossible to distinguish GPT-generated text (sans the obvious mannerisms it has at the moment, which will surely be gone in a few months). Perhaps it'll really be better to wait until WP makes some "official rules" and discuss those?
But the idea behind this topic was more about the general "heading" of the project. As I said, lots of the work we are doing now may soon become useless - e.g. due to change in the way people search for stuff. So we could try to "think out of the box" and suggest stuff that looks like could be (easily) automated + would move the page forward, and start brainstorming those? -- andree 20:22, 10 May 2023 (UTC)
I understand that the copyvio concerns are about whether the copyright is owned by the human who uses the LLM and posts it on wiki vs the people who created the LLM. Nobody seems to think that traditional copyvios (e.g., the LLM stealing paragraphs from other websites) are likely to be a concern. While I've not seen any authoritative answers to that question, the general sense seems to be that the output of an LLM is not eligible for copyright in the US (=is automatically and inherently in the public domain, which is fine with us). Consequently, we probably don't need to warn people against copyvios in this context.
About the "check for accuracy" idea, I think that two things would be useful:
  • You are responsible for whatever you post, regardless of whether you wrote it yourself, ran it through a grammar checker, transformed it through machine translations, or used other tools to generate the content.
  • Any bot or other software that edits pages directly must comply with the Wikivoyage:Script policy.
In other words, the rules are fundamentally the same as they've always been. WhatamIdoing (talk) 16:23, 12 May 2023 (UTC)
I assume there are different AI engines that behave in different ways. If they do not quote passages from text they have encountered, then I believe there to be no copyright problems with current legislation. But anyway, yes, the user is responsible. The problem is that the temptation to trust an AI bot may be there, and you can generate text much faster with that method than by copying pieces of random reviews. –LPfi (talk) 16:36, 12 May 2023 (UTC)
We could update the script policy to address the speed of editing more explicitly. It currently mentions the "pace" of editing in the lead, and later establishes a rule that "Bots without botbit should make only one edit per minute to prevent flooding the recent changes." We could expand that to include content generated through LLMs or other automated or semi-automated means. WhatamIdoing (talk) 14:52, 13 May 2023 (UTC)
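Mechanically, a "one edit per minute" rule like the one quoted from the script policy boils down to a simple throttle. This is only an illustrative sketch (the class name and interface are invented); a real bot would pass in time.monotonic() and sleep for the returned number of seconds before saving.

```python
# Sketch of the "one edit per minute" throttle discussed above. The class
# and its interface are invented for illustration; a real bot would feed
# in time.monotonic() and call time.sleep() on the returned value.

class EditThrottle:
    def __init__(self, min_interval_s=60.0):
        self.min_interval_s = min_interval_s
        self.last_edit_s = None

    def wait_needed(self, now_s):
        """Seconds to wait before the next edit is allowed at time now_s."""
        if self.last_edit_s is None:
            return 0.0  # first edit goes through immediately
        return max(0.0, self.min_interval_s - (now_s - self.last_edit_s))

    def record_edit(self, now_s):
        self.last_edit_s = now_s

throttle = EditThrottle(min_interval_s=60.0)
print(throttle.wait_needed(0.0))   # 0.0
throttle.record_edit(0.0)
print(throttle.wait_needed(20.0))  # 40.0 - must wait out the rest of the minute
```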
Here I didn't think of any speed close to the bot thresholds. Rather, using an AI bot you could upgrade a section a minute instead of one every fifteen minutes with manual fact gathering. LPfi (talk) 06:29, 16 May 2023 (UTC)
Two thoughts:
  • We could change that from "one edit per minute" to "one edit per minute for simple, repetitive changes, such as fixing a typo and one edit per five minutes for content creation".
  • Even if it takes 15 minutes to write something by hand, it doesn't take 15 minutes to review it. I don't remember the last time I needed to spend more than 60 seconds looking at a diff before deciding whether it needed to be reverted.
WhatamIdoing (talk) 15:23, 16 May 2023 (UTC)
Hello everyone, I am an editor of Wikiviajes in Spanish. On this topic, I would like to add a little. A few weeks ago I made comments on the Spanish Wikipedia about this, and in one of the points I discussed, the artificial intelligence gave me wrong information about a destination. I transcribe:
"Here is a list of 10 things you can do in Palmera City (I used a small city in my country, but for privacy reasons I do not disclose it).
  • Visit the museum of death.
  • Take a beer tour.
  • Visit the ecological park.
  • Eat at a local restaurant.
  • Watch a movie at the cinema.
  • Shopping at the market.
  • Visit the water park.
  • Visit the zoo.
  • Take a trip to the canyon."

The thing is that in that city there is no such museum, no zoo, and much less a water park; the nearest canyon is on the other side of the state, a few thousand miles away. Apparently the source it got this from does exist, and it is a tourist website - ironic, giving wrong information to the traveler... Diff

Greetings. --Hispano76 (talk) 01:34, 26 May 2023 (UTC)

I assume the AI can do such things by itself: what points of interest do you usually suggest when somebody is asking? Often you do suggest the zoo and a water park. The AI just doesn't have any idea that they aren't relevant for this destination. Of course, as the AIs develop, they will get a better "understanding" of what is required for those to be suggested (such as that they should exist), but similar errors are to be expected in the foreseeable future. That's why these attractions should be linked to the official website, where the location and other details can be checked. If the AI cannot provide it, it cannot be trusted. –LPfi (talk) 08:48, 26 May 2023 (UTC)
ChatGPT also thought it was possible to travel from FI to DK via AX (also allowing time to explore all three countries) in one day before correcting itself yesterday. SHB2000 (talk | contribs | meta) 08:57, 26 May 2023 (UTC)

AI and Wikivoyage

Swept in from the pub

Food for thought. I asked ChatGPT to "write a wikivoyage article about la toussuire" (wikipedia:La Toussuire). (This is a French tourist resort missing from WV that a student of mine would like to write about; I've used it as a test case of how useful this would be for us - and for my class.) Obviously, I cannot prevent my students from using this tool, and we need to be aware that some will do so - and not just students, there will be (new?) editors who will try to help like this (or "help"). And I think it is a good learning opportunity for my class, particularly when it comes to stuff like "how to get started with your article", and a reminder to double-check all information / verify all factual claims. But as for WV at large, this does suggest AI could be of use to us by generating entries - as long as someone would check them, of course...? Anyway, if anyone would like to criticize the AI entry and be very specific about mistakes it made as relates to WV's manual of style or other policies, do let me know - I'd like to show this example to students in a few days. Piotrus (talk) 09:03, 20 September 2023 (UTC)

FTR, the previous discussion about this issue can be found on Wikivoyage talk:Copyright-related issues#Articles vs GPT. --SHB2000 (talk | contribs | meta) 11:12, 20 September 2023 (UTC)
I did not look at the accuracy or style (or the copyright questions raised in the other thread) of the ChatGPT generated article. Facts can be double-checked, and style can be fixed (no different from new pages that need some help, or old pages that have fallen out of date). I don't want to open a new can of worms, but you can even ask ChatGPT in your prompt or in a followup question to use wikivoyage formatting and templates, and then ask for the full markup ready to paste into a new article.
My concern with using AI to generate content is that it contravenes the philosophy of the whole guide. Consider the following:
"Wikivoyagers are travel writers and members of a world-wide community of contributors to Wikivoyage. [...] We are people just like you. Some of us are interested in travelling, some are interested in their local communities, and others are interested in wiki-housekeeping and organisation. What all of us have in common is that we want to share what we know with travellers everywhere." (Wikivoyage:About#Wikivoyagers) (emphasis added)
The value of our travel guide is that it is the product of a community effort to share what we collectively know about places for the sake of travel. It's great not just because it can be up to date and accurate, but there is something human about deciding to share information, and there is something equally human about acting on shared information. Especially in travel.
"Whenever travellers meet each other on the road, they swap info about the places they came from and ask questions about places they're going." (Wikivoyage:About)
Maybe I just enjoy coming home with a good story, but I'll always opt for the suggestion of a fellow traveller in a new place over something that happens to be the top result in any search engine. Presenting AI travel "advice" alongside what Wikivoyagers have written and revised over many years does a disservice to the values that make WV unique. Gregsmi11 (talk) 17:22, 20 September 2023 (UTC)
@Gregsmi11 Fair point, but it does take a "fellow traveller" to make the AI do something. Is it fair to deny them this tool if they can use it responsibly?
That said, I tried the tool again and I think we need to be careful - something I'll stress to my students. For example, a while ago I wrote a guide to Chorzów. I've now asked the AI to do the same. Some of the AI-generated content seems usable at first glance, but it is certainly important to double-check everything. For example, it generated an entry for "Saint Jadwiga church" that contains an error and then some pointless generalities: "This stunning Neo-Gothic church is a notable architectural landmark. Its intricate design and beautiful stained glass windows are worth admiring." The church is not Neo-Gothic but Neo-Romanesque.... It also invented a listing for a non-existent "Stary Browar Chorzów" and likewise for some restaurants that don't seem to exist - it seems it started hallucinating. Its sleep section contains an entry on a real hotel followed by a fake one and some gibberish about apartments.
The amount of hallucinations I see gives me pause, given Wikivoyage does not require references. Piotrus (talk) 10:04, 21 September 2023 (UTC)
It's not fair to deny any tool used responsibly, but I don't think we know what responsible use of AI looks like yet (or at least we haven't written a policy about it). Formatting listings? Brainstorming attractions? Translating and copyediting? Creating basic outline articles where we have gaps in coverage? We've got the hammer, now we just need to invent some nails before we start swinging.
I think ChatGPT is especially risky because it can be very good at making hallucinations sound reliable. Pointless generalities tend to be edited out eventually, but small factual errors can live a very long time if every editor assumes they know less than the original contributor. It would only take a handful of well-meaning users with blind faith in AI before most of the work here turns into fact-checking. And that's probably the best-case scenario... just wait until an enterprising editor sets up an AI to edit directly. Gregsmi11 (talk) 14:07, 21 September 2023 (UTC)
I don't think ChatGPT has access to the internet. (Maybe I'm wrong?) It might be handy if you could say "Give me the URLs for five free activities in <city>", followed by "Okay, is my favorite of the activities. Now give me a 100-word summary of the activity described on that website." WhatamIdoing (talk) 16:28, 21 September 2023 (UTC)
Isn't there a risk of copyright violation from AI, too? People have to be responsible for their own words. Ikan Kekek (talk) 18:00, 21 September 2023 (UTC)Reply
Depending on the AI. If the wordings are derived from a big mass of text, none quoted verbatim, there is no copyright involved. The copyright laws may change because of the "write/draw like NN" problem, but that shouldn't be an issue for us. On the other hand, an AI can reuse phrases, if trained to do that, and if those are long or original enough, the result will be copyvios. –LPfi (talk) 19:58, 21 September 2023 (UTC)Reply
I think we'd go crazy trying to find the original source of text once the AI has rewritten and paraphrased it anyway. There's also the less legal, more ethical question of whether it is right for someone to claim authorship over text they retrieved from an AI. The user might be OK from a licence standpoint, but can they say they "wrote" something? Gregsmi11 (talk) 20:23, 21 September 2023 (UTC)Reply
Good point. LPfi, uncredited quotes or poor paraphrases don't have to be entirely verbatim to violate copyright. Ikan Kekek (talk) 20:50, 21 September 2023 (UTC)
The US Copyright Office says that AI-generated content is not written by a human. It is therefore ineligible for copyright protection and automatically in the public domain.
It is possible for an LLM to generate something that matches existing copyrightable text, though it's not really an LLM if it just copies and pastes from its source files. I'm sure you've heard about the w:en:Infinite monkey theorem (the version that went around the playground when I was a kid said "an infinite number of monkeys typing on an infinite number of typewriters for an infinite amount of time would produce Shakespeare's plays"); LLMs will "randomly" match text much faster than the monkeys would, because they're not actually random, because there are multiple texts to match with (that's the w:en:Birthday problem), and because we don't need an exact match to have a copyvio.
But on our end, a copyvio gets detected the same way, regardless of whether it is intentional, negligent, or accidental, and whether it was generated by a human or an LLM. WhatamIdoing (talk) 15:43, 22 September 2023 (UTC)
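As a rough illustration of the birthday-problem point above: the chance of at least one accidental match grows much faster than intuition suggests, because every generated passage is compared against every protected text. The following sketch uses entirely hypothetical numbers (it makes no claim about any real LLM or corpus), and assumes each passage/text pair matches independently with a small probability:

```python
# Birthday-problem-style estimate: probability that at least one of
# n generated passages matches one of m protected texts, assuming
# each of the n*m pairs matches independently with probability p.
# All numbers are hypothetical, for illustration only.

def collision_probability(n: int, m: int, p: float) -> float:
    """P(at least one match) = 1 - (1 - p)^(n * m)."""
    return 1 - (1 - p) ** (n * m)

# Even with a one-in-a-million chance per pair, 100 passages checked
# against 10,000 texts already give a ~63% chance of some match.
print(round(collision_probability(100, 10_000, 1e-6), 3))  # → 0.632
```

The point of the sketch is only that the number of *pairs* (n × m) is what drives the risk, which is why many texts to match against produces collisions far sooner than matching against a single work would.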
ChatGPT is at least better than Bard, which has outright lied to me several times. --SHB2000 (talk | contribs | meta) 12:36, 22 September 2023 (UTC)
Here's an example of a new article that triggers AI content detection: Gwanggyo. If you've spent a lot of time on ChatGPT, you might recognize its "voice" in large parts of the article. Is this problematic? It's certainly no worse than many other new article creations that need template work and copyediting. A 10-second plagiarism check doesn't reveal anything. I'm having difficulty finding the "Gwanggyo Modern library" online, but I'm searching in English, so it's certainly not evidence of a hallucination. Gregsmi11 (talk) 14:07, 22 September 2023 (UTC)
I used Bard and ChatGPT for travel ideas when I was in Tokyo and Osaka last month. Neither is bad, but both definitely underestimate the time for some activities when I asked them to create itineraries for me. I would say ChatGPT is worse because it told me to spend 90 mins in Dotonbori for lunch and exploration, but it would take 25 mins for me to walk from the previous destination to Dotonbori. So that leaves me only 65 mins to eat and explore?! And it didn't schedule anything after 5pm, which completely neglects the nightlife scene. At least Bard named some concrete evening activities in Tokyo. OhanaUnitedTalk page 14:23, 22 September 2023 (UTC)
Another example, surely doomed for deletion: Sebago Lake. I can only guess this was copied directly from ChatGPT without attempting to use our standard templates or styles. However, the phrase "recreational opportunities for humans" does indeed have a certain ring to it. Gregsmi11 (talk) 18:22, 5 October 2023 (UTC)

Walking tour copyright?

Swept in from the pub

Many guidebooks describe walking tours or other itineraries with a number of waypoints. The text in the book is normally copyrighted, and cannot be copied to Wikivoyage word for word. Provided that the text is rewritten, is there any intellectual property protection for a sequence of waypoints, or anything similar? The Millennium Tour is based on a map published by the Stockholm City Museum, which is a public institution. As the tour had no licensing from the author's estate, and is no longer actively hosted, it seems to be fair game. The Stockholm history tour is a composition of a couple of guidebooks and guided tours. I am considering making a Haunted Stockholm Tour, based on a copyrighted book on the topic, which is in turn based on legends and folklore. There is also a commercial tour on the topic. How close can a travel topic be to a copyrighted work? /Yvwv (talk) 03:41, 27 November 2023 (UTC)

As a non-lawyer speaking about American law, the closest things that come to mind are 1) the arrangement and selection of quotations can be copyrighted, even if the person assembling those quotations does not own the original copyright on the works being quoted, and 2) sets of plain facts cannot be copyrighted. So if a walking tour is something like the arrangement of quotations, then yes. But if following a linear path that takes you from the shore to inland, or from north to south in the most efficient route, or that goes from the town's oldest building to its newest one, etc., is just elaborating on a set of facts, then no. As someone who is still not a lawyer by the end of this comment, I'd imagine that tour pathways are generally going to be the latter, but as with most real legal questions, the answer has a lot to do with whether someone sues you and how much money he has to afford good lawyers. —Justin (koavf)TCM 04:13, 27 November 2023 (UTC)
I hope expensive lawyers aren't key in Sweden. Finnish law is closer, but I am not a lawyer either, and I haven't researched that aspect of the law. If I were inspired by a former tour, I would attribute the original as a common-sense courtesy. It is probably best either to just be inspired, without copying, or to ask for permission to revive the tour, although I don't see how the itinerary itself could be copyrighted (as it is about ideas). I assume the situation is similar to copying the plot of a novel without copying the actual writing. –LPfi (talk) 09:37, 27 November 2023 (UTC)
We might also consider business ethics. Professional tour guides normally don't publish their itineraries, so that their knowledge stays useful. Open-source articles on other tours in the same city are, however, more likely to increase interest in the city, and leave visitors with more money to spare there. And in general, Wikimedia projects don't usually limit themselves to avoid rivalry with commercial publishers. /Yvwv (talk) 16:43, 27 November 2023 (UTC)
It might be worth doing some research in a large library (and maybe secondhand bookshops). Although you have identified one copyrighted book on Stockholm's ghosts, are some of the waypoints also covered by other books? If three different books (by different authors) mention a location, then good. Maybe there is even an out-of-copyright book that covers some of the locations (the 1890 guide to Stockholm?). AlasdairW (talk) 23:34, 28 November 2023 (UTC)
After reading Linnell's book, I find it so comprehensive that it will be difficult to find any well-known ghost story in Stockholm that is not mentioned by the book. In any case, much of the quality of folklore is the dramatic storytelling. Linnell does not really deliver that, so over time, the article can hopefully be expanded with ghost legends told in a compelling manner. /Yvwv (talk) 19:42, 29 November 2023 (UTC)
I recommend that you read w:Copyright in compilation. I think a sequence of waypoints is similar to "A directory of the best services in a geographic region", an example of a copyrightable compilation listed in the article.--Hnishy63 (talk) 23:49, 29 November 2023 (UTC)
Thanks for the advice. The book (ISBN 9151827387) as a whole is close to being a complete database. It also suggests a few walking tours, one of them in Gamla stan; the intention of the Haunted Stockholm tour is to make a slightly different itinerary, partially inspired by commercial guided tours. /Yvwv (talk) 01:01, 30 November 2023 (UTC)
Another piece of advice: you could contact the publisher and ask for explicit permission from the author. Tell them the following: 1) I recently found that compilations are copyrightable. 2) My itinerary is non-commercial. 3) It uses a very small part of your work. 4) It may actually increase sales of the book. 5) I should have asked before publishing my itinerary. You have a good chance, and in any case it would clarify the situation.--Hnishy63 (talk) 22:20, 3 December 2023 (UTC)
Our license requires them to permit commercial use of our text. We are non-commercial, but re-users might not be. WhatamIdoing (talk) 15:01, 4 December 2023 (UTC)
There are plenty of digital sources for Stockholm's ghost stories. I will use sources other than the main book when available, and also add waypoints not found in the book, to avoid copyvio. Hopefully, the article will be good to feature for Halloween 2025, after the 2-year cooldown since the featuring of the Swedish Empire. /Yvwv (talk) 15:14, 5 December 2023 (UTC)